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CHAPTER  1 
INTRODUCTION 

Digital  image  processing  has  gained  considerable  importance  due  to  its  numerous  applications 
in  the  aerospace,  biomedical,  commercial  television  and  video  teleconferencing  systems.  The 
availability  of  super-fast  chips  for  digital  data  processing  has  made  the  hardware  implementation 
of  the  image  processing  algorithms  feasible  for  satellite  applications  due  to  the  reduction  achieved 
in  weight,  size  and  power  consumption.  A  considerable  amount  of  work  done  in  the  area  of  image 
processing  has  focused  on  coding,  bandwidth  compression  and  pattern  recognition. 

In  the  area  of  image  processing  on  board  a  satellite,  the  usual  objectives  are  image 
enhancement,  efficient  encoding  to  reduce  the  transmission  or  storage  capacity  requirements  and 
pattern  recognition  for  the  purpose  of  extraction  of  certain  feature  points.  In  this  work,  we  focus 
on  a  different  aspect  of  digital  data  processing.  Here,  we  are  concerned  with  the  estimation  of  image 
data  using  past  statistics.  Specifically,  we  are  interested  in  an  on-line  prediction  of  the  next  few 
frames  of  a  video  sequence  using  the  available  frames.  The  problem  then  is  that  of  parameter 
identification  of  a  time-varying  system  using  a  priori  knowledge.  For  this  purpose,  we  apply 
estimation  theory  concepts  and  derive  a  fixed  predictor  as  well  as  one  that  is  adaptive,  i.e.,  one 
which  predicts  frames  by  analyzing  that  data  which  was  received  most  recently.  In  the  former,  a 
fixed  prediction  algorithm  is  used  whereas  in  the  latter  it  is  based  on  the  most  recent  data.  The 
specific  application  considered  is  that  of  a  remotely  piloted  vehicle  where  a  man-in-the-loop  uses 
images  relayed  by  a  spacecraft  in  orbit  for  remotely  maneuvering  the  vehicle.  The  prediction  of 
the  image  data  is  expected  to  enhance  the  pilot’s  ability  to  maneuver  the  vehicle  by  compensating 
for  the  data  which  are  either  corrupted  by  channel  noise  or  lost  because  of  a  temporary  loss  in  the 
communication  link.  Otherwise,  the  estimates  of  the  next  frames  impart  added  knowledge  about 
the  scene  and  the  target  movement,  and  the  resulting  smoothing  effect  is  expected  to  aid 
s.gnificantly  in  the  piloting  operation. 
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A  brief  description  of  the  other  areas  of  research  is  presented  below  for  a  contrast  with  our 
objective.  The  area  of  image  enhancement  is  concerned  with  restoring  the  quality  of  the  pictures, 
which  may  be  degraded  because  of  a  noisy  satellite  channel,  or  enhancing  the  contrast  for  a  better 
scene  interpretation.  The  results  of  image  enhancement  work  done  at  the  Jet  Propulsion 
Laboratory  for  improving  the  picture  quality  of  the  pictures  of  the  Moon  and  Mars  are  well 
known.  Pattern  recognition  is  another  important  area  of  research,  and  refers  to  extraction  of 
patterns  or  other  information  from  images.  Its  applications  are  in  the  area  of  biomedical 
engineering,  automatic  mapping  of  earth  resources  from  satellite  photographs,  etc.  Video 
bandwidth  compression  is  an  area  that  has  received  a  lot  of  attention  and  is  concerned  with  the 
problem  of  bandwidth  constraint  either  for  storage  or  transmission.  For  instance,  the  bandwidth 
available  from  a  spacecraft  for  real-time  transmission  is  severely  limited  due  to  the  total  weight, 
equipment  si2e  and  power  constraints.  This  necessitates  compressing  of  the  image  data  into  a  much 
smaller  bandwidth,  simultaneously  minimizing  any  degradation  in  the  received  picture  quality. 
Bandwidth  compression  techniques  seek  to  achieve  this  by  removing  the  redundancy  inherent  in  an 
image,  or  a  sequence  of  images,  both  in  the  space  domain  as  well  as  the  time  domain  so  that  the 
image  can  be  represented  by  a  smaller  bandwidth.  A  considerable  amount  of  work  in  the  image 
processing  area  has  focused  on  this  problem.  Seyler  [l]  describes  a  coding  technique  to  reduce  the 
channel  capacity  requirement.  The  applications  of  predictive  coders  and  transform  coders  are  well 
known.  The  former  exploit  spatial  and  temporal  redundancies  in  the  data,  and  the  latter  transform 
the  image  data  into  the  frequency  domain,  and  achieve  compression  by  exploiting  the  fact  that  the 
human  eye  is  sensitive  only  to  changes  in  the  lower  frequency  coefficients.  Hybrid  techniques  [2] 
have  also  been  widely  applied  since  they  use  a  combination  of  algorithms  to  achieve  the  best 
compromise  between  implementation  complexity  and  performance.  Adaptive  compression 
techniques  are  important  due  to  their  ability  to  monitor  their  performance  and  inject  a  feedback 
term  in  their  algorithm,  to  adapt  themselves  to  changes  in  the  scene  statistics.  This  makes  them 
more  robust  than  their  nonadaptive  counterparts.  Ericsson  [3]  reports  good  results  when  applying 
adaptive  predictors  rather  than  fixed  predictors  for  bandwidth  reduction  via  interframe  coding.  A 
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survey  of  adaptive  image  coding  techniques  is  given  in  [4],  For  applications  involving  feedback 
control  systems  and  bandwidth  compression  systems,  great  improvements  in  performance  have 
been  reported  with  the  use  of  adaptive  techniques.  This  was  the  motivation  to  apply  this  approach 
for  image  data  prediction. 

As  stated  above,  this  work  addresses  a  different  problem.  Here,  the  dynamic  estimation 
problem  deals  with  the  determination  of  image  pixel  intensities  of  a  video  frame  based  upon  those 
of  the  frames  already  received,  i.e..  estimation  of  unknown,  time-varying  parameters  using 
measurable  data.  The  motivation  for  this  work  is  as  follows:  A  teleoperator-based  remotely 
piloted  spacecraft  transmits  video  images  obtained  from  the  on-board  cameras  to  a  ground  control 
station  for  scene  and  target  interpretations.  A  man-in-the  loop  (the  pilot  on  the  ground  control 
station)  relies  on  the  images  transmitted  from  the  spacecraft  for  maneuvering  it  near  a  target 
spacecraft  for  surveillance.  In  the  case  of  a  disabled  spacecraft,  the  aims  are  rendezvousing  and 
docking  for  the  purpose  of  retrieval.  In  this  application,  knowledge  of  the  next  few  frames  of  the 
video  sequence  could  greatly  enhance  the  pilot's  perception  of  the  scene  and  target  motion,  and  thus 
aid  significantly  in  the  remote  piloting  operation. 

An  added  motivation  is  due  to  the  problem  created  by  a  temporary  loss  in  the  communication 
link  between  the  spacecraft  and  the  ground  control  station.  As  shown  in  Figure  1.  the  spacecraft  in 
orbit  modulates  the  digitized  video  signal  onto  a  radio  frequency  (RF)  carrier  and  transmits  it  to  a 
communication  satellite  such  as  NASA's  Telecommunications  and  Data  Relay  Satellite  (TDRS). 
The  TDRS  receives  the  signal,  amplifies  it,  remodulates  it  onto  another  RF  carrier  and  transmits  it 
to  the  ground  control  station.  In  a  situation  where  the  target  spacecraft  is  spinning,  the  parent 
spacecraft  will  have  to  spin  up  to  the  same  rate  to  be  able  to  dock  with  the  target.  A  spinning 
spacecraft  would  have  to  alternate  between  its  antennas  for  transmission  of  the  data  as  the  relay 
satellite  moves  out  of  the  field  of  view  (FOV)  of  one  antenna  and  into  that  of  the  other.  The  FOV 
of  the  satellite  antennas  is  usually  limited  by  the  antenna  size  and  weight  constraints.  It  is 
conceivable  that  during  a  part  of  the  rotation,  the  communication  satellite  may  not  be  within  the 


FOV  of  either  antenna.  This  would  result  in  a  temporary  loss  in  the  communication  link  resulting 
in  the  loss  of  a  few,  frames.  A  poor  link  performance  (bit  error  rate  of  the  channel)  could  also 
potentially  result  in  some  loss  of  data,  impairing  the  pilot’s  ability  to  maneuver  the  vehicle, 
resulting  in  either  a  loss  of  the  mission,  the  spacecraft,  or  both.  In  such  cases,  reconstruction  of  the 
missing  frames  using  the  past  statistics  could  avert  a  catastrophic  failure.  The  problem  then  is  not 
only  filtering  of  the  data  based  on  the  frames  received,  but  also  that  of  predicting  the  nest  few 
frames.  Even  if  there  were  no  missing  frames,  image  data  prediction  offers  smoothing  cf  the  data 
which  would  considerably  increase  the  pilot's  perception  of  the  scene. 

For  the  sake  of  image  data  restoration,  the  simplest  solution  may  appear  to  be  a  frame  refresh 
based  upon  the  last  frame  received.  This  work  seeks  to  exploit  the  statistical  correlation  between 
the  pixels  of  adjacent  frames  of  a  video  sequence  for  a  more  accurate  prediction  of  the  images. 
Specifically,  a  fixed  and  an  adaptive  predictor  are  utilized.  Underlying  both  of  these  approaches  is 
the  problem  of  parameter  estimation  of  time-varying  parameters.  For  solving  the  prediction 
problem,  we  first  represent  the  image  sequence  as  a  discrete-time  linear  state  vector  model.  The 
challenge  presented  by  this  approach  of  using  fixed  and  adaptive  frame  predictions  of  a  video 
sequence  based  upon  the  past  frames  received  is  in  modeling  the  scene  dynamics  and  representing 
the  image  processing  problem  as  a  state  vector  model.  When  the  system  model  is  completely 
specified,  standard  parameter  estimation  techniques  can  be  used  for  designing  optimal  predictors. 
In  our  case,  however,  the  system  model  is  not  completely  specified.  The  problem  is  compounded  by 
the  fact  that  the  video  image  sequence  is  characterized  by  time-varying  parameters  rather  than 
stationary  ones.  We  approach  this  problem  in  the  fixed  predictor  case  by  exploiting  the  inherent 
correlation  of  the  adjacent  pixels  of  a  frame,  and  that  of  adjacent  frames  to  derive  the  state  vector 
model  by  assuming  a  fixed  interframe  and  intraframe  correlation.  In  the  case  of  an  adaptive 
predictor,  we  dense  the  interframe  correlation  from  the  set  of  frames  received  and  assume  that  in 
the  case  of  slew  dynamics,  which  is  typically  the  case  in  spacecraft-to-spacecraft  docking  situation, 
the  same  interframe  correlation  can  be  applied  to  the  next  set  of  frames.  We  intend  to  investigate 
if  it  :s  possible  to  obtain  better  results  using  tnese  approaches  than  with  simple  frame  refresh. 


Chapter  2  gives  the  formulation  of  the  image  processing  problem. 

A  truly  adaptive  predictor  would  have  to  take  into  account  and  compensate  for  the  relative 
displacement  between  successive  frames  of  a  sequence.  This  is  because  the  scene  is  actually 
nonstationary  as  the  camera  is  moving  with  the  spacecraft  and  taking  pictures  of  a  target  which 
may  also  be  in  motion.  Hence,  to  gain  an  added  insight  into  the  problem,  we  explore  the  application 
of  the  pattern  recognition  theory  for  estimating  object  motion  parameters  based  on  a  sequence  of 
images.  Dynamic  scene  analysis  is  receiving  increasing  attention  from  researchers  in  image 
processing  and  pattern  recognition.  Three-dimensional  projection,  optical  flow  and  trajectory 
determination  are  the  common  approaches  for  determining  object  motion  from  a  video  sequence.  A 
brief  description  of  these  is  given  below,  followed  by  a  description  of  the  approach  that  is  used  in 
this  work. 

Three-dimensional  projection  techniques  entail  an  inverse  projection  of  the  2-dimensional  (2- 
D)  image  frame  onto  a  3-D  space.  This  approach  makes  use  of  the  fact  that  all  motion  is  in  fact  3- 
D  and  consists  of  both  translational  and  rotational  components.  A  frame  is  a  2-D  representation  of 
the  3-D  scene  and  may  lead  to  ambiguity  about  the  real  scene  since  many  different  scenes  could 
produce  the  same  2-D  image.  In  other  to  get  a  correct  depth  model,  one  must  consider  the  third 
dimension  and  estimate  the  translational  and  rotational  motion  parameters  from  the  sequence  of 
2-D  video  frames.  Roach  and  Aggarwal  [5]  describe  such  a  technique  for  determining  the 
movement  of  the  objects  from  a  sequence  of  images.  Another  well-known  technique  is  that  of 
optical  flow.  Optical  flow  methods  represent  motion  in  the  image  plane  as  sampled,  continuous 
velocity  fields.  These  are  considered  to  be  a  powerful  tool  for  dynamic  scene  analysis  because  they 
contain  important  information  about  the  depth,  structure  and  motion  of  objects.  However,  the 
techniques  for  determining  optical  flow  are  known  to  be  computationally  very  complex.  One 
approach  for  the  computation  of  the  optical  flows  is  given  by  Jain  in  [6],  Another  wav  of 
recovering  3-D  parameters  is  based  on  trajectory  determination  for  certain  key  points  in  the  images. 
Trajectory-based  methods  rely  on  the  recognition  of  the  same  set  of  feature  points  in  two  or  more 


successive  frames,  and  then  utilize  the  correspondence  between  them  to  extract  motion  information. 
Such  methods  have  attracted  a  lot  of  attention  because  they  are  simpler  to  compute  than  optical 
flow  methods.  Sethi  and  Jain  [7]  present  an  algorithm  for  determining  trajectories  of  certain 
feature  points  from  a  sequence  of  images  and  use  it  to  extract  knowledge  about  the  third 
dimension. 

It  is  recognized  that  some  sort  of  movement  compensation  must  be  accomplished  in  order  to 
make  the  frame  recovery  more  meaningful.  Since  the  camera  is  not  stationary  but  moves  with  the 
parent  spacecraft  as  it  approaches  the  target,  the  translation  and  the  rotation  of  the  target  with 
respect  to  the  camera  can  be  significant,  and  must  be  estimated  and  accounted  for  in  our  prediction 
process.  The  accuracy  of  the  prediction  process  is  degraded  by  that  portion  of  the  picture  area 
which  can  be  classified  as  being  nonstationary.  Movement-compensated  interframe  prediction 
offers  a  promising  approach  to  improving  the  accuracy  of  prediction  by  estimating  and 
compensating  for  the  nonstationary  part  of  the  image.  In  our  specific  application  of  target  tracking, 
we  are  not  so  concerned  with  determining  the  shape  of  the  object  as  with  its  relative  orientation  to 
the  parent  spacecraft.  This  is  because  the  target  shape  would  almost  always  be  known  to  us  a 
priori.  Shape  determination  has  different  applications  such  as  in  computerized  tomography  where  a 
physician  may  be  interested  in  determining  the  shape  and  location  of  a  tumor  in  a  patient.  For  this 
reason,  we  apply  movement  compensation  algorithms  to  improving  the  accuracy  of  prediction 
instead  of  the  3-D  projection  techniques,  which  usually  involve  solving  of  a  complex  set  of 
nonlinear  equations.  In  Chapter  5,  we  describe  one  of  the  techniques  reported  in  the  literature  that 
we  applied  for  the  extraction  of  motion  data  and  show  that  it  is  possible  to  improve  the  accuracy  of 
prediction  by  compensating  for  motion. 

We  demonstrate  the  performance  cf  the  algorithms  via  a  subjective  evaluation  of  the 
reconstructed  frames  and  also  via  some  standard  objective  measures  of  performance  such  as  mean- 
square  error,  mean-  absolute  error  and  signal-to-noise  ratio.  These  performance  measures  and  the 
motivation  to  use  those  rather  than  a  visual  evaluation  alone  are  described  in  detail  in  Chapter  2. 


To  demonstrate  the  performance  of  the  algorithms,  we  use  a  set  of  8  frames.  The  frames  are  from 
a  video  tape  of  a  spacecraft-to-spacecraft  docking  simulation,  and  represent  the  kind  of  data  a 
remote  teleoperator  may  have  to  work  with.  A  typical  video  tape  would  have  a  frame  rate  of  30 
frames  per  second.  We  deliberately  selected  frames  which  were  not  consecutive,  but  skipped  over 
several  frames  in  between.  This  is  done  so  as  to  create  a  situation  which  is  much  worse  than  what 
a  teleoperator  would  normally  encounter  and  thereby  obtain  very  conservative  results.  These  data 
are  considered  relevant  because  the  main  motivation  for  the  work  is  to  aid  in  remote  piloting 
operation.  Each  frame  of  the  image  data  consists  of  512  rows  of  picture  elements  (pixels)  with  512 
pixels  in  each  row.  The  data  are  digitized  with  8  bits  per  pixel,  which  is  equivalent  to  a 
representation  of  the  image  pixel  intensity  on  a  scale  of  0  to  255,  with  255  representing  the 
brightest  intensity.  The  digitization  results  in  a  512x512  array  of  integers.  We  use  these  digitized 
frames  for  evaluating  the  performance  of  a  fixed  predictor  versus  an  adaptive  predictor.  The 
digitized  data  are  processed  on  a  VAX  11/750.  In  order  to  better  draw  any  conclusions  from  the 
study,  we  increased  the  number  of  cases  by  4  by  considering  256x256  pixel  subimages. 
Furthermore,  in  order  to  ease  the  computational  burden  imposed  by  the  size  of  the  matrices, 
especially  the  computation  of  matrix  inverses,  we  process  the  images  in  32x32  pixel  subblocks.  For 
obtaining  the  full  frame,  all  the  subblocks  are  pasted  together  in  the  proper  order. 

For  a  subjective  evaluation  of  the  frame  estimates  and  comparison  with  the  original  frames, 
we  use  the  COMTAL  Vision  One/20  system.  It  is  a  complete  image  processing  system  consisting  of 
a  fully  integrated  LSI- 1 1  processor,  image  processing  electronics  and  application  firmware  for  image 
display.  The  system  allows  digitization  of  images,  as  well  as  display  of  digitized  data,  i.e.,  analog- 
to-analog  (A  D)  and  digital-to-analog  (D/A)  conversion.  The  digitized  frame  data  were  provided 
in  the  UNIX  TAR  (Tape  Archive)  format  which  is  not  compatible  with  the  VMS  operating  system 
of  the  VAX  11/750  processor.  In  order  to  convert  the  raw  binary  data  into  real  number  matrices 
for  processing  on  the  VAX  processor,  we  first  arrange  the  raw  data  into  block  structured  files 
consisting  of  512  byte  blocks.  Subsequently,  we  use  a  set  of  standard  tape  utilities  and  also  some 
special  programs,  to  read  the  binary  files  and  convert  them  into  real  number  matrices  for  algorithm 
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processing.  The  frame  estimates  obtained  via  the  algorithms  are  likewise  converted  into  block- 
structured  binary  files  and  transported  to  the  COMTAL  machine  for  D/A  conversion.  Chapter  6 
gives  a  discussion  of  the  simulation  results.  We  do  the  processing  in  software  to  investigate  the 
feasibility  of  the  approach.  However,  in  practice,  the  processing  would  be  done  in  hardware  with 
the  use  of  high-speed  adders  and  multipliers.  Since  this  processing  would  be  done  at  a  ground 
control  station  instead  of  on  board  a  spacecraft,  we  are  not  so  constrained  by  the  weight,  power, 
and  size  of  the  processing  equipment. 

The  organization  of  the  thesis  is  as  follows.  Chapter  2  presents  a  formulation  of  the  image 
processing  problem  and  describes  a  set  of  objective  performance  criteria  used  for  performance 
evaluation  of  the  approaches.  Chapter  3  presents  the  fixed  predictor  approach.  The  prediction  is 
applied  to  both  single  and  multiple  steps  of  prediction  using  both  a  single  frame  and  multiple 
frames.  Performance  of  the  algorithm  is  evaluated  using  the  performance  criteria  outlined  in 
Chapter  2.  Chapter  4  presents  the  application  of  the  adaptive  predictor  technique  to  the  image 
processing  problem.  We  also  describe  the  peculiarities  of  the  image  processing  problem  and  the 
resulting  mathematical  complexity  involving  matrix  manipulations.  The  results  are  presented  for  a 
suboptimal  adaptive  predictor.  Chapter  5  presents  the  approach  used  for  displacement 
measurement  between  consecutive  frames  and  show  the  effect  on  improving  the  accuracy  of  frame 
estimates.  Chapter  6  presents  pictures  of  the  frame  estimates  derived  via  the  various  approaches 
along  with  the  original  pictures  and  discuss  the  results.  Chapter  7  presents  the  conclusions  of  the 
work.  Based  upon  our  findings,  some  areas  for  future  research  are  suggested. 
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CHAPTER  2 

IMAGE  PROCESSING  PROBLEM  FORMULATION 


2.1.  Introduction 

In  this  chapter,  we  give  a  mathematical  representation  of  the  image  processing  problem  stated 
in  Chapter  1.  For  solving  the  prediction  problem,  we  model  the  image  processing  problem  as  a 
discrete-time  linear  state  vector  model.  This  model  is  used  subsequently  in  Chapters  3  and  4  to 
derive  fixed  and  adaptive  estimates  of  video  frames.  For  modeling  the  image  sequence,  we 
represent  a  sequence  of  video  frames  as  a  discrete-time  nonstationary  process  with  each  individual 
frame  being  represented  as  an  N-dimensional  vector,  f(k).  Since  the  frames  are  2-dimensional  and 
have  Nj  rows  of  N2  pixels  each,  as  shown  in  Figure  2,  therefore,  f(k)  can  be  considered  to  be  a 
column  vector  of  (NjXNj)  elements. 

As  explained  in  Chapter  1,  in  order  to  ease  the  computational  complexity,  we  process  32x32 
pixel  blocks  of  the  image  instead  of  the  entire  256x256  pixel  subimages.  Also,  we  take  advantage 
of  the  adjacent-pixel  correlation,  both  within  a  frame  and  between  successive  frames,  to  define  the 
system  matrices  by  representing  f(k)  as  a  32x32  matrix  instead  of  a  1024x1  vector.  Thus,  we  can 
make  use  of  matrix  manipulation  algorithms  to  derive  the  fr;>me  estimates.  This  statistical  model 
assumes  that  each  32x32  block  of  pixels  represents  a  two-dimensional  separable  wide-sense 
stationary  process.  In  reality,  however,  the  pixels  of  a  block  are  dependent  on  the  pixels  in  the 
neighboring  blocks.  This  is  due  to  the  inherent  non-separability  of  the  images,  and  the  resulting 
modeling  error  is  seen  as  blockiness  in  the  reconstructed  images  where  smooth  lines  are  expected. 
This  can  be  seen  in  the  simulation  pictures  presented  in  Chapter  6.  In  applications  where  such 
degradation  is  intolerable  such  as  medical  applications  of  computerized  tomography,  there  are 
approaches  for  overcoming  the  border  effect.  One  approach  involves  using  overlapping  subblocks 
and  subsequently  discarding  the  borders.  In  our  application,  however,  this  minor  degradation  in 
the  reconstructed  picture  quality  is  not  a  problem.  This  is  because  the  main  aim  in  remote  piloting 
is  not  a  determination  of  the  shape  of  the  object,  but  its  relative  orientation  to  the  parent 


spacecraft  Also,  the  shape  of  the  object  is  almost  always  known  a  priori.  For  these  reasons, 
processing  of  the  images  in  32x32  pixel  subblocks  is  considered  a  good  compromise  between  picture 
quality  and  computational  complexity. 

What  follows  is  a  mathematical  model  of  the  image  processing  problem. 

2.2.  Equation  Formulation 

The  source  is  a  sequence  of  \-dimensional  vectors  { f (k ) } . 

f(k)  =  (fx(k).fa(k) . fN(k))T  .  (1) 

where  N  is  the  number  of  elements  in  vector  f(k).  and  k  is  the  frame  number  in  the  sequence. 

For  the  image  processing  application,  each  vector  f(k)  represents  a  video  frame  with  N 
number  of  pixels.  For  a  frame  containing  N\  rows  and  N2  columns. 

N  =  IV  j  x  N2  . 

The  scene  dynamics  are  modeled  as  a  state  vector  model  where  a  frame  is  represented  as  a  state 
vector.  The  structure  of  the  first-order  model  is 

f(k+l)  =  Af(k)  + W(K)  .  (2) 

where  the  suffixes  k  and  k+1  are  discrete  time  instants.  Matrix  A  represents  system  parameters, 
and  \V(k).  the  system  modeling  error. 

In  order  to  improve  the  accuracy  of  prediction,  it  is  often  helpful  to  increase  the  order  of  the 
model  sc  as  to  whiten  the  modeling  error,  W.  This  is  equivalent  to  estimating  f(k+l)  based  upon 

not  just  f(k),  but  also  f(k-l ).f(k-2) . f(k-M).  In  this  case,  the  system  model  has  the  following 

structure 

f(k+l)  =  Ajf(k)  +  A2f(k-1)  +  •  ■  •  +  AM+1f(k— M)  +  W(k)  .  (3) 

Equation  (3)  is  readily  recognized  as  a  higher -order  state  model  where  now  the  state  vector  is 
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[f(k)T  f(k — 1  )T  •  •  •  f(k— M)T]T  .  (4) 

The  observed  data  is  corrupted  by  channel  noise  representing  the  digital  data  transmission  error 
associated  with  the  satellite  link.  The  general  system  model  for  a  first-order  model  is  as  follows: 

f(k+l)  =  ACk)f(k)  +  W(k) :  f(k)  6  RN'  (5) 

where  A  (k)  is  the  system  matrix. 

f(k)  is  the  .\-dimensional  state  vector. 

W(k)  is  the  modeling  error  between  the  actual  value  f(k+l)  and  the 
predicted  value  ftilde(k+l/k). 

f(k  +  l/k)  is  the  estimate  at  time  k+1  knowing  measurements  at  time 
instants  up  to  and  including  k. 

This  state  vector  model  is  similar  to  the  model  used  in  Kalman  filter  application.  However, 
here  we  are  not  using  any  observations  for  updating  the  estimates  at  each  instant  as  the  new  data 
becomes  available.  Instead,  we  consider  a  subset  of  the  available  frames  to  predict  the  next  few 
frames  using  the  state  vector  equation  only,  and  show  that  as  more  frames  are  received,  we  can 
derive  a  better  estimate  due  to  a  decrease  in  the  number  of  steps  of  prediction  required. 

A  prediction  of  the  vector  f~(k+l)  is  formed  based  on  the  past  statistics,  f(k),f(k-l)....,f(k- 
M).  using  either  the  fixed  or  the  adaptive  predictor  described  in  Chapters  3  and  4. 

The  prediction.  f~(k+l).  is  obtained  from 
1  f(k) 

(one-step  prediction.  f(k+l)/f(k)),  or 

2.  f(k)  and  f(k-l ) 

(two-step  prediction,  (f~(k+l )/f(k),  f(k-l)).  or 

3.  f(k),  f(k-l).  f(k-2)....  and  f(k-M) 

(multi-step  prediction.  (f~(k+l )/f(k).  f (k- 1 ) f(k-\l)). 
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The  mode!  presented  above  is  used  in  Chapters  3  and  4  to  derive  frame  estimates  via  fixed  and 
adaptive  predictors.  The  estimate,  f(k+l),  is  compared  with  the  actual  frame  f(k-i-l)  to  obtain  both 
a  subjective  and  an  objective  evaluation.  For  the  subjective  evaluation,  the  frame  data  is  converted 
from  digital-to-analog  format  as  described  in  Chapter  1  and  is  displayed  on  a  computer  monitor. 
The  performance  is  also  analyzed  using  a  set  of  objective  measures  of  performance  which  are 
described  in  detail  in  the  following  section. 

2.3.  Performance  Criteria 

Having  modeled  our  system,  our  next  objective  is  to  outline  a  set  of  criteria  for  analyzing  and 
comparing  the  performance  of  various  techniques  for  image  data  prediction.  In  this  section,  we 
discuss  criteria  for  an  objective  evaluation  of  the  reproduced  images.  It  is  recognized  that  the 
visual  fidelity  assessment  of  reconstructed  video  images  is  based  upon  a  subjective  evaluation  of 
the  images.  This  is  because  the  ultimate  user  of  these  images  is  man.  Seyler  [8]  describes  visual 
communications  and  the  psychophysics  of  human  vision  and  suggests  that  the  objective  of 
television  is  to  produce  "as  accurately  as  practicable  a  realistic  replica  of  the  natural  environment 
shown,  i.e..  (to  create)  in  the  viewer's  mind  the  illusion  of  direct  communication."  We  assume  that 
the  television  cameras  employed  on  board  the  spacecraft  conform  to  an  accepted  standard  and 
regard  the  original  video  frames  as  an  accurate  replica  of  the  real  scene.  Our  objective  then  is  to 
reproduce  those  images  with  as  little  distortion  as  possible.  Since  the  ultimate  destination  of  these 
images  is  a  man-in-the-loop,  the  most  important  criterion  is  his  accurate  perception  of  the  scene. 

However,  for  the  purpose  of  designing  communications  systems  and  for  comparing 
performances  of  alternative  systems  and  designs,  one  also  requires  an  explicit  evaluation  of  the 
reproduced  images.  It  is  widely  recognized  that  to  mathematically  model  man’s  sense  of  vision, 
including  luminance  and  chrominance  vision,  is  a  very  complex  problem.  It  is  an  accepted  practice 
to  employ  measurements  which  are  analytically  more  tractable  then  the  mathematical  models  of 
human  vision  and  have  criterion  values.  This  applies  both  to  analog  and  digital  transmissions,  and 
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monochrome  and  color  images.  In  this  work,  however,  we  are  concerned  with  digital  transmission 
and  monochrome  images  only. 

In  employing  objective  measures  of  performance  for  assessing  visual  fidelity  of  reconstructed 
images,  it  is  implied  that  video  distortion  is  identifiable  with  errors  in  reproduction  and  will  result 
in  a  poor  performance  with  respect  to  the  objective  criteria  employed.  Otherwise,  the  criteria 
would  have  only  a  limited  value  or  none  at  all.  A  number  of  papers  on  image  processing  have 
addressed  this  issue  and  sought  to  find  a  numerically-valued  measure  of  distortion  which  has  a 
reasonable  correspondence  with  the  subjective  evaluation  by  a  human  interpreter.  Hall.  Budrikis. 
and  Mannos  [9.10,11]  address  this  problem  and  suggest  some  alternatives. 

In  this  work,  since  we  also  propose  to  use  subjective  evaluation  for  the  reproduced  images,  we 
have  restricted  ourselves  to  commonly  used  objective  measures  of  performance  which  are  described 
below. 


2.3.1.  Standard  image  quality  measures 

Some  commonly  used  image  quality  measures  are  defined  below. 

1.  Mean-Square  Error 

One  of  the  most  commonly  used  quality  or  distortion  measure  is  mean-square  error  (MSE). 
MSE  is  defined  as 

MSE  =  E  [f(k)  —  f(k)]2  , 

where  F(k)  is  the  estimate  of  the  frame  and  f(k)  .  the  actual  frame  . 


For  an  NxN  discrete  image,  MSE  may  be  defined  as 


1 

MSE  =  - 

NTX.\ 
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[f(i.j)  —  f(i.j)]2  . 


(6) 


This  measure  is  attractive  because  it  is  analytically  tractable.  Its  limitation  is  that  on  certain  kinds 
of  images,  it  does  not  correspond  very  closely  with  human  evaluation. 

2.  Normalized  Mean-Square  Error 
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One  can  also  define  an  image  quality  measure  based  on  MSE  and  energy  normalization.  It  is 


called  normalized  mean  square  error  (NMSE).  and  it  is  defined  as 


MSE  between  the  original  and  reconstructed  frame 


The  peak-to-peak  signal-to-noise  ratio  (SNR)  is  defined  as 
(peak— to— peak  signal  value)2 

SNR  =  10  log  — - - - - -  ,  (9) 

MSE 

where  peak-to-peak  signal  value  is  255  for  an  image  quantized  with  S  bits.  The  SNR  is  a 
commonly  accepted  image  quality  measure  and  has  a  reasonably  good  correlation  with  the 
distortion.  :n  the  reproduced  image. 

The  results  in  Chapters  3.  4.  and  5  are  tabulated  according  to  these  criteria  for  various  cases. 
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CHAPTER  3 

APPLICATION  OF  A  FIXED  PREDICTOR  FOR  IMAGE  DATA  PREDICTION 

3.1.  Introduction 

In  Chapter  3,  we  present  a  mathematical  model  that  could  be  used  to  describe  the  dynamical 
behavior  of  the  image  processing  problem.  The  problem  with  modeling  the  scene  dynamics  as  a 
state  vector  model  is  a  complex  one  since  the  system  matrix.  A,  of  Equation  (5)  is  highly  scene- 
dependent,  and  also  depends  on  how  rapidly  the  object  is  moving.  In  an  application  such  as  video 
image  processing,  where  the  system  and  noise  models  are  either  ill-defined  or  not  completely 
specified,  it  is  feasible  to  estimate  a  model  using  certain  properties  which  are  peculiar  to  video 
images.  For  developing  a  fixed  predictor,  we  derive  a  state  vector  model  by  exploiting  the  high 
level  of  adjacent-pixel  correlation  inherent  in  video  images.  This  is  true  both  of  adjacent  pixels  of  a 
frame  as  well  as  corresponding  pixels  of  adjacent  frames.  We  call  this  a  fixed  predictor  because  we 
use  a  fixed  system  matrix,  A.  This  predictor  is  then  applied  to  the  most  recent  set  of  video  frames 
received  for  the  purpose  of  estimating  the  next  few  frames  of  the  sequence. 

The  advantages  of  using  this  predictor  for  image  data  prediction  are  as  follows: 

(1)  To  carry  out  on-line  prediction  of  image  data  using  frames  as  they  become  available  instead  of 
doing  frame  refresh,  which  depends  only  on  the  last  frame  received.  Here,  we  utilize  more 
data  to  try  to  derive  a  more  accurate  estimate. 

(2)  To  incorporate  modeling  error  as  system  noise  to  improve  the  accuracy  of  the  prediction  even 
more.  This  is  not  feasible  with  frame  refresh  alone. 

(3)  To  derive  interframe  motion  using  a  set  of  the  last  frames  received  via  a  determination  of 
pixel  trajectories  or  other  techniques  and  use  the  displaced  frames  instead  of  the  actual  ones 
for  the  prediction.  Thus,  we  can  do  motion  compensation  which  the  technique  of  frame 
refresh  does  not  allow. 
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In  Chapter  6.  we  present  the  pictures  of  the  frame  estimates  derived  via  various  combinations 
of  available  frames  for  both  single  as  w’ell  as  multiple  steps  of  prediction.  The  performance 
according  to  the  criteria  outlined  in  Chapter  2  is  summarized  in  Tables  1-5.  The  results  obtained 
with  the  use  of  frame  refresh  only  are  summarized  in  Tables  6  and  7  for  a  comparison.  It  is  shown 
that  the  use  of  the  fixed  predictor  provides  better  criterion  values  (MSE  and  SXR)  than  the  frame 
refresh.  However,  an  evaluation  of  both  the  objective  and  the  subjective  merit  of  the  reconstructed 
frames  seems  to  indicate  that  there  is  not  a  tremendous  improvement  over  frame  refresh. 

3.2.  Equation  Formulation 

We  use  a  sequence  of  frames  (f(k)},  where  each  f(x)  is  an  NjXN2-  dimensional  matrix.  As 
explained  in  Chapter  2.  f(k)  is  considered  to  be  a  32x32  matrix.  The  structure  of  the  first-order 
model  is  as  given  in  Chapter  2.  Equations  (2)-(5).  The  unmodeled  dynamics  are  accounted  for  by  a 
model  error  term.  Just  as  a  reference,  the  general  system  equation  for  a  first-order  model  is  as 
follows: 


f(k+l)  =  A(k)  f(k)  +  W(k)  ;  f(k)  6  R' 


(10) 


where  A(k)  is  the  system  matrix 

f(k)  is  the  N-dimensional  (N=NjXN2)  state  vector, 

W(k)  is  the  modeling  error  between  the  actual  value  f(k+l)  and  the  predicted  value 
?(k  +  1,'k ), 

f (k-t-  1/k)  is  the  estimate  of  f(k+l)  knowing  measurements  at  time 
instants  up  to  and  including  k,  and 
the  suffixes,  k  and  k  +  1,  are  discrete  time  instants. 

Assumption  The  statistical  properties  of  W(k)  are  assumed  to  be  zero-mean  white  Gaussian  noise 
with  the  covariance  given  by 

Ei  W(k)  W(1)TI  =  Q(k)8kl 


Table  1.  Fixed  Predictor,  Next-Frame  Prediction 
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Table  2.  Fixed  Predictor.  Two-Frame  Ahead  Prediction 

f(K+2)  =  <X>(K+2.K)*f(K) 


DESCRIPTION 
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2 

2.65 

3 

5.82 

4 

r 

2.66 

2! 


Table  3.  Fixed  Predictor  Three-Frame  Ahead  Prediction 


f(K  +  3)  =  <t>(K  +  3.k>f(K) 


DESCRIPTION 

BLK  * 

VIA  BSE  . 

VISE 

l  9c  XMSE 

|  SNR  (dB.I  OVERALL  LM- 

1 

1 

1  AGE  SNR  (dB)  : 
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IMAGE  8 
FROM 

IMAGE  7  & 
IMAGE  6 


Table  4.  Fixed  Predictor,  Next-Frame  Prediction 
Using  a  Last  Two  Frames  Received 

f(K+l)  =  A1*f(K)+A2*f(K— 1) 


DESCRIPTION  BLK  #  MABSE 


%  NMSE  SNR  (dB) 


OVERALL  IM¬ 
AGE  SNR  (dB) 


\  Vl'i 


iV 
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$ 


c>J 


V 

V  -- 


$ 
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Table  5.  Fixed  Predictor,  Next-Frame  Prediction 
Using  Last  Three  Frames  Received 

f(K+l)  =  A1*f(K)+A2*f(K— l)+A3»f(K— 2) 


DESCRIPTION  BLK  # 
-  - 

IMAGE  4  - 

2 

FROM  - 

3 

IMAGE  3  &  4 

IMAGE  2  & 

IMAGE  1 


MABSE 

4.63 

2.20 


IMAGE  5 

FROM 

IMAGE  4 

& 

IMAGE  3 
IMAGE  2 

& 

MSE 

%  NMSE 

SNR  (dB) 

38.8 

0.34 

32.2 

9.79 

0.09 

38.2 

66.7 

0.55 

29.9 

10.9 

0.10 

37.7 

OVERALL  IM¬ 
AGE  SNR  (dB) 


IMAGE  6 
FROM 

IMAGE  5  & 
IMAGE  4  & 
IMAGE  3 


8.8  | 


60.0  0.52 


Table  6.  Next-Frame  Prediction  by  Frame  Refresh 


DESCRIPTION 

BLK  # 

MABSE 

.MSE 

%  N'MSE 

SNR  (dB) 

OVERALL  IM¬ 
AGE  SNR  (dB) 

IMAGE  2 

FROM 

IMAGE  1 

1 

3.37 

22.5 

0.19 

34.9 

36.2 

2 

2.14 

8.8 

0.08 

38.7 

3 

3.35 

22.6 

0.19 

34.6 

4 

2.16 

8.94 

0.08 

38.6 

IMAGE  3 

FROM 

IMAGE  2 

1 

5.07 

44.2 

0.37 

31.7 

33.2 

2 

2.72 

18.2 

0.17 

35.5 

3 

5.04 

44.8 

0.38 

31.6 

4 

2.76 

16.8 

0.16 

35.9 

IMAGE  4 

FROM 

IMAGE  3 

1 

5.73 

59.1 

0.51 

30.4 

32.3 

2 

2.70 

15.7 

0.15 

36.2 

3 

5. 87 

62.4 

0.54 

30.8 

4 

2.76 

16.6 

0.16 

35.9 

IMAGE  5 

FROM 

IMAGE  4 

1 

6.07 

61.9 

0.53 

30.2 

i 

32.1 

2 

2.90 

17.8 

0.17 

35.6 

3 

6.22 

65.1 

0.56 

30.0 

4 

2.9416.3 

0.15 

36.0 

IMAGE  6 

FROM 

IMAGE  5 

1 

9.23 

171.0 

1.49 

25.8 

25.0 

2 

9.42 

226.0 

1.80 

24.6 

3 

9.75 

189.0 

1.65 

25.4 

4 

10.0 

1.96 

24.2 

IMAGE  7 

FROM 

IMAGE  6 

1 

6.09 

!  76-2 

0.66 

29.3 

27.8 

2 

6.85  132.0  1.00 

26.9 

3 

^  1 

6.10  78.4  0.68 

29.2 

4 

7  49  147.0  I  1.10 

26.5 

IMAGE  8 

FROM 

1 

7.01  94.2 

0.77 

28.4 

2 

904  206.0 

1.49 

UBS 

24.5 


IMAGE  7 


_3 

4 


6.95  95.0 

9.92  i  233.0 


0.78 

1.64 
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DESCRIPTION 


IMAGE  3 
FROM 
IMAGE  1 


Table  7.  Frame  Refresh,  Two-Frame  Ahead  Prediction 
f(K+l)  =  A1*f(K)+A2*f(K— 1  )+A3*f(K— 2) 


BLK  #  MABSE  MSE  %  NMSE  SNR  (dB)  OVERALL  IM¬ 
AGE  SNR  (dB) 


0.38 


0.15 
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where  Q(k)  is  a  positive  semi-definite  matrix  and  8kJ  is  the  Kronecker  delta 

ll  for  k  =  l 

10  for  k  =  l  ' 

Algorithm  for  fixed  prediction  in  the  absence  of  modeling  error. 

(1)  Relation  between  prediction  f(k+l)  and  f(k), 

f(k+l/k)  =  A(k)f(k).  (11) 

(2)  For  multiple  steps  of  prediction,  the  relation  between  f(k+M/k)  and  f(k)  is  given  by 

f(k+M/k)  =  <J>(k+M.k)  f(k)  (12) 

where  $  is  the  state  transition  matrix. 

<J>(m.m)  =  I 

(13) 

4>(m.n)  =  A(m— 1)  .  A(m— 2)  •  ■  •  A(n) 

A  generalized  state  model  is 

f(k)  =  4>(k,k— M)  f(k-M)  +  W(k)  .  (14) 

It  is  seen  that  the  fixed  predictor  resembles  the  standard  estimation  technique  of  Kalman  filter 
with  one  exception.  Here,  we  are  not  using  a  measurement  vector  to  update  the  parameters 
between  samples.  However,  the  fixed  predictor  allows  us  to  update  the  state  vector  by  using 
the  most  recent  set  of  frames,  and  thus  provides  a  better  estimate  as  the  number  of  steps  of 
prediction  is  lowered. 

33.  Performance  Evaluation 

In  order  to  find  a  suitable  system  matrix  A,  we  used  the  fact  that  video  images  are 
characterized  by  very  high  pixel  correlation,  both  in  the  space  domain  and  the  time  domain,  i.e.,  in 
the  temporal  direction.  We  assume  that  the  pixels  are  zero-mean  samples  of  a  separable  Markov 
process.  Then,  by  assuming  a  fixed  adjacent-pixel  correlation,  the  structure  of  the  A  matrix  is  as 
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shown  in  Figure  3.  This  correlation  structure  is  used  to  describe  both  spatial  and  temporal 
correlation,  i  e.,  between  adjacent  pixels  of  a  frame  as  well  as  between  corresponding  pixels  of 
adjacent  frames.  We  assume  a  fixed  pixel  correlation  of  0.96  for  both  intraframe  and  interframe 
correlation.  This  results  in  an  A  matrix  town  in  Appendix  B  for  a  first-  order  system  for  one 
frame-ahead  prediction  using  the  last  frame  received.  Other  matrices  are  selected  likewise  to 
increase  the  system  order  for  increasing  the  accuracy  of  prediction. 

Tables  1-5  show  the  results  of  applying  this  algorithm  to  obtain  frame  estimates  for  the  next 
frame  (Table  l).  two-frame  aherd  prediction  (Table  2).  three-frame  ahead  prediction  (Table  3). 
next-frame  prediction  using  the  last  two  frames  received  (Table  4).  and  using  the  last  three  frames 
(Table  5).  The  results  of  the  frame  refresh  technique  applied  to  the  next-frame  prediction  are 
summarized  in  Tables  6  and  7  for  a  prediction  based  on  the  last  frame  received,  and  the  last  two 
frames,  respectively.  These  are  presented  for  a  contrast  with  the  fixed  predictor  results.  It  is  seen 
from  these  tables  that  the  fixed  predictor  provides  better  criterion  values  than  frame  refresh  in 
almost  all  the  cases.  The  results  in  Tabies  1-5  were  obtained  without  compensating  for  motion 
compensation.  It  is  shown  in  Chapter  5  that  it  is  possible  to  improve  upon  the  performance  of  the 
predictor  by  estimating  and  compensating  for  the  interframe  displacements.  The  pictures  of  the 
frame  estimates  are  presented  in  Chapter  6.  From  a  subjective  evaluation  of  the  reconstructed 
frames,  it  appears  that  there  is  not  a  tremendous  advantage  to  using  a  fixed  predictor  over  a  simple 
frame  refresh  in  spite  of  what  the  criterion  values  indicate.  All  these  results  and  a  subjective 
evaluation  of  the  reconstructed  frames  are  discussed  in  detail  in  Chapter  6. 


CHAPTER  4 


APPLICATION  OF  AN  ADAPTIVE  PREDICTOR  FOR  IMAGE  DATA  PREDICTION 

4.1.  Introduction 

In  the  case  of  a  fixed  predictor,  we  assume  a  certain  model  for  describing  the  interframe 
relationship  of  a  video  sequence.  That  model  is  then  used  for  prediction  of  frames  using  a  set  of 
past  frames.  In  the  case  of  an  adaptive  predictor,  we  estimate  the  state  vector  model  using  the 
interframe  correlation  of  the  available  frames,  and  by  making  an  assumption  that  the  model  can  be 
approximated  as  a  wide-sense  stationary  random  process.  In  an  application  such  as  remotely 
piloted  spacecraft,  it  appears  to  be  a  reasonable  assumption  especially  in  the  docking  mode  which  is 
characterized  by  very  slow  motion. 

The  approach  used  for  developing  the  adaptive  predictor  is  the  classical  parameter  estimation 
technique  of  generalized  least  squares.  We  seek  to  find  an  optimal  adaptive  predictor  which  would 
provide  a  ieast  mean-square  solution  to  the  prediction  problem,  i.e..  minimize  the  square  of  the 
error  between  the  original  and  the  reconstructed  frames.  It  is  shown  in  this  chapter  that  the  ill- 
conditioned  nature  of  the  image  processing  problem,  specifically  the  sample  covariances,  render  it 
computationally  a  very  complex  problem.  It  is  shown  that  when  we  represent  the  images  by  the 
intensity  salues  of  the  pixels,  the  resulting  matrices  are  almost  always  singular  or  nearly  singular. 
In  the  image  processing  problem,  the  image  degradation  is  represented  as  a  transformation.  Hence, 
lo  recover  the  original  image  from  a  degraded  one  or  to  reconstruct  images  often  requires 
computation  of  inverse  transformations,  which  is  mathematically  represented  as  a  mati  x  inversion 
problem  It  is  shown  that  such  matrices  have  zero  or  near-zero  eigenvalues  and  thus  it  is  not 
p*  ssioie  tc  find  an  inverse.  We  show  that  the  well-known  technique  cf  adding  a  disturbance  along 
d.agt.oal  to  stabilize  singular  matrices  does  not  overcome  the  problem  of  the  singularity  of  these 
mat: ..es  We  'hen  explore  the  use  of  Singular  Value  Decomposition  (SVD)  in  order  u  isolate  and 
'ii>v.a-d  toe  near  zero  singular  values  and  attempt  to  find  an  inverse  by  effectively  reducing  the 
•  ;  -.x  >t  'h*  matrix.  It  is  shown  that  the  large  variation  in  the  singular  values  effectively  amplifies 


the  noise  and  makes  the  recovered  image  unacceptable  in  terms  of  visual  fidelity.  Because  of  the 
mathematical  complexity  associated  with  the  optimal  adaptive  predictor,  we  use  a  suboptimal 
adaptive  predictor  which  still  uses  the  previous  samples  to  derive  the  frame  estimates.  The  intent 
is  to  evaluate  the  performance  of  this  suboptimal  predictor  relative  to  that  of  the  fixed  predictor 
and  frame  refresh. 

4.2.  Equation  Formulation 

As  mentioned  above,  the  algorithm  for  the  adaptive  preditor  is  based  on  the  standard 
technique  of  generalized  least  squares  parameter  estimation.  The  equations  summarized  below  can 
be  found  in  any  standard  textbook  or  reference  in  estimation  theory  and  in  [3].  A  prediction  of 
vector  f(k),  namely.  f(k).  is  formed  based  on  previously  reconstructed  vector  f“(k-l)  as  shown  in 
Figure  4. 

f(k)  =  B(k)  f(k-l)  .  (15) 

where  B(k)  is  a  time-varying  predictor  and  f(k)  is  the  kth  frame  of  the  sequence.  The  prediction 
can  also  be  based  upon  a  set  of  previously  reconstructed  vectors.  {f(k— l).f(k— 2) f(k— M)}.  For  a 


non-time-varying  predictor, 

f(k)  =  Bf(k-l)  .  (16) 

The  B  vector  can  also  be  chosen  to  be  a  diagonal  matrix, 

B  =  diagfbj.bj . bN)T  .  (17) 


in  which  case  each  element  of  the  estimated  vector  f^k)  depends  only  on  the  corresponding  element 
of  the  previous  reconstructed  vector,  f(k— l),  i.e.. 

f,(k)  =  b,Tf(k— 1)  . 

Using  M  previously  reconstructed  vectors. 

?(k)  =  B(k)  g(k)  ,  (18) 


where  g(k)  is  a  column  vector  consisting  of  the  M  previous  vectors 
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id 


X 
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g(k)T  =  (f(k-l)T  f(k— 2)T  •  ■  •  f(k— \1)T)  .  (19) 

Equation  (18)  is  a  generalization  of  Equation  (15)  using  multiple  steps  of  prediction. 

The  prediction  error  e(k)  is 

e(k)  =  f(k)  -  B(k)  g(k)  .  (20) 

In  the  matrix  form.  e(k)  can  be  expressed  as 

ft(k— 1) 
f2(k— 1) 

f^k)  bjT(k)  fN(k_i) 

e(io  =  f2<k)  -  blI(k)  .  on 

fN(k)  b^(k)  fj(k-M) 

f2(k— M) 

fN(k— M) 

Here,  we  are  assuming  that  the  quantization  error  is  negligible,  i.e.. 
f(k)  =  f(k)  . 

The  predictor  vector  B(k)  consists  of  N  vectors  b,(k)  corresponding  to  each  element  of  the  vector 
f(k),  i.e.. 

B(k)  =  (bj(k)  b2(k)  ■  •  bN(k))T  .  (22) 

where  each  b,(k)  consists  of  NxX  coefficients  for  the  i-th  element  of  f(k).  The  error  vector  e(k)  is 
given  by 

e,(k)  =  f,(k)  —  b,(k)T  g(k)  .  for  i  =  1.2 . N  .  (23) 

At  this  point,  we  select  the  standard  parameter  estimation  technique  of  least  mean  squares  (LMS) 
for  developing  an  optimal  predictor.  In  other  words,  an  optimal  predictor  is  one  that  minimizes  the 
reconstruction  error,  e^k).  Using  this  criterion,  the  optimal  estimate  would  be  the  LMS  estimate. 
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The  LMS  estimate  f(k)  is  obtained  by  choosing  the  predictor  matrix  B(K)  such  that  the  mean 
square  reconstruction  error  is  minimized,  i.e., 

MSE  =  E{e,(k)2}  is  minimized  for  all  i  =  1.2 . N  .  (24) 

The  error  covariance  matrix  Re(k)  is  defined  as 

R;(k)A=E{e(k)  e(k)T}  (25) 

Min  |e,(k)2}  =  MinjDiagonal  elements  of  Re(k)}  (26) 

=  Minftr  Rf(k)}  . 

The  optimal  predictor  B(K)  minimizes  the  trace  of  Re(k)  for  an  LMS  criterion. 

VB;ic){tr  Re(K)}  =  0  .  (27) 

Using  simple  matrix  manipulations  and  the  following  properties  of  a  trace  operator, 

tr  (A+B)  =  tr  A  +  tr  B  .  and 
tr  AT  =  tr  A  , 

tr{ Re( k) }  =  tr  E{e(k)  e(k)T)  (28) 

=  tr  E([f(k)-B(k)  g(k)][f(k)T-g(k)T  B(k)T:i 
=  tr(R(k,k)-B(k)Rgf(k)-Rgf(k)TB(k)T  +  Bfk)Rg(k)B(k)T} 

=  tr(R(k,k))  -  2  tr(B(k)Rff(k))  (29) 

+  tr(B(k)Rg(k)B(k)T)  , 

where  the  following  definitions  f:r  covariance  matrices  apply 

A 

R(k,.k2)  =E{f('k1)  f(k2)T}  (30) 

Rg(k)  =E|g(k)  g(k)1 ! 

A  T 

Rgf(k)  =E{g(k)  f(k)  )  . 

The  optimal  predictor  B(k),  namely,  B+  is  such  that  it  minimizes  tr{Rr(k)}.  i.e.,  from  Equations 
(29;  and  (18) 
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B+  =  B(k)  such  that 


VB(k)[tr  R(k)}  -  2  tr{ B(k)Rgf( k.) }  +  tr|B(k)Rg(k)B(k)Tl]  =  0 


Using  simple  matrix  manipulation  and  the  following  rules  of  matrix  gradient  erations. 


VHtr(BA)  =  A  and 


Vg  (BABt)  =  BAT  +  BA  .  for  the  optimal  predictor  . 


VBU)  tr  {Re(k)(  =  -2  Rgf(k)T  +  B(k)(Rg(k)  +  Rg(k)T)  =  0  . 


Since  R  (k)  is  symmetric,  assuming  that  its  inverse  exists,  the  LMS  predictor. 


B+(k)  =  Rgf(k)T  Rg(k)  1  . 


The  LMS  prediction  error, 


ef(k)  =  f(k)  —  Bf(k)  g(k) 


together  with  Rgf(k)  and  Rg(k).  can  be  approximated  using  the  covariance  of  the  original  data, 


E{f(k— l)f(k)T}  1  rR(k-l.k) 


RgfCk)  = 


E{f(k— 2)f(k)T}  R(k— 2,k) 


iE{f(k-M)f(k)T}  1  R(k— M.k) 


E{f(k— l)f(k— 1)T}  •  ■  •  E{f(k— l)f(k— M)t} 


Rg(k)  = 


lE{ f (k — M3f (k — 1  )T}  •  •  •  Eif(k— M)f(k— M)t} 


R(k-l.k-l)  •••  R(k-l.k-M) 


‘R(K-M.K-l)  •  •  •  R(K-M.K-M) 


For  a  single-step  prediction. 


Rgf(k)  =  R(k— l,k)  and 


v;* 


R?(k)  =  R(k-l.k-l) . 

In  the  following  section,  we  address  the  mathematical  problems  of  using  the  optimal  predictor 
derived  in  this  section  for  obtaining  the  frame  estimates.  The  equation  for  the  optimal  predictor. 
Equation  (34).  assumes  that  the  frame  to  be  estimated  is  available  for  deriving  the  autocorrelation 
between  previous  frames  represented  by  g(k)  and  f(k).  Hence,  in  practice,  one  must  somehow 
estimate  this  autocorrelation.  We  do  this  by  assuming  that  the  image  sequence  is  represented  by  a 
zero-mean  separable  Markov  process  which  is  wide-sense  stationary.  Specifically,  the  equations  for 
a  suboptimal  predictor  are  derived  as  follows:  The  optimal  predictor  is  given  by  Equation  (34). 
i.e.. 

B+(k)  =  Rgf(k)T  R/k)'1  .  (34) 

In  the  case  of  the  next  frame  prediction  using  only  the  last  frame  received,  i.e..  f~(k+l)  using  only 
f(k).  the  optimal  predictor  is  given  by 

B’(k)  =  E{f(k)  f ( k— 1  )T)  Eif(k-l)  fOc-l)1}-1  (38) 

=  (f(k)  f(k-l)T}  If(k-l)  f(k-l)Tr‘  ■  (39) 

We  derive  a  suboptimal  predictor  by  using  the  wide-sense  stationarity  on  the  covariance 
staticnarity  of  the  image  sequence  as  follows: 

Using  Equations  (18;  and  (39).  the  frame  estimate  is 

f(k)  =  { f(k— 1 )  f {k — 2 )T }  { f Ck — 1 )  f ( k — 1  )T}_1  f(k-l)  .  (40) 

Ac  we  see.  this  approach  would  require  at  least  a  couple  of  frames  in  order  tc  estimate  the  next 
frame  of  the  sequence  Equation  (34)  assumes  that  an  inverse  of  RgC k J  exists.  The  mathematical 
details  of  the  problem  are  discussed  in  the  next  section. 

4.3.  Application  of  Singular  Value  Decomposition  (SVD)  to  the  Image  Processing  Problem 

The  equation  for  trie  optimal  feedback  predictor,  b\  given  by  Equation  (34).  assumes  that  an 
inverse  of  R  (kj  exists.  When  we  apply  the  adaptive  predictor  algorithm  for  frame  prediction 


mm 
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using  the  last  two  frames  received  according  to  Equation  (40),  we  find  that  the  matrix  R?(k)  is 
singular.  This  is  because  it  has  many  zero  eigenvalues  as  seen  in  Figure  5 a.  We  then  try  to  make  it 
nonsingular  by  adding  a  small  constant  to  the  diagonal  of  R?(k).  Figure  5b  shows  the  resulting 
eigenvalues.  It  is  obvous  that  the  matrix  is  so  ill-conditioned  that  it  can  not  be  inverted.  In  such 
problems  involving  singular  matrices,  one  would  compute  the  generalized  inverse  which  is  the 
minimum  norm  solution.  Equation  (40)  may  be  rewritten  as 

f(k)  =  [f(k— 1)  f(k— 2)T]  (f(k-l)T)f  .  (41) 


where  represents  the  Moore-Penrose  pseudoinverse.  For  a  matrix  A,  the  Moore-Penrose  inverse  is 
given  by 

Af  =  (ATA)_1  AT  .  (42) 

Since  the  matrix  [f(k—  l)f(k— l)T]  is  singular,  its  rank  is  less  than  the  order  of  f(k-l)  or  the 
number  of  unknowns.  Hence,  there  is  no  unique  inverse  for  this  matrix.  In  the  case  of  an 
underconstrained  problem  requiring  a  solution  of  a  set  of  linear  equations  of  the  form 

Ax  =  b  . 

we  use  singular  value  decomposition  (SVD)  which  guarantees  a  minimum  norm  solution  to  the  set 
of  equations  represented  by  Ax=b.  Hence,  to  derive  an  optimal  predictor,  we  use  SVD  of  f(k— l)T 
to  find  a  unique  solution  to  an  otherwise  indeterminate  problem. 

The  SVD  problem  involves  spectral  decomposition  of  the  matrix.  A,  in  Equation  (42)  as 
follows.  In  the  case  of  real  matrix  A, 

A  =  U  I  VT  ,  (43) 

where  U  and  V  are  unitary  matrices  such  that  the  columns  of  U  and  V  are  composed  of  a  set  of 
orthonormal  eigenvectors.  U  is  the  row  eigenvector  system  of  A  and  V  is  the  column  eigenvector 
system  of  A.  The  matrix  £  is  a  diagonal  matrix  which  has  singular  values  of  A  (the  positive 
square-roots  of  the  non-zero  eigenvalues  of  ATA)  on  its  diagonal.  The  eigenvectors  u(  are  the 
spectral  components  of  the  observation  space  and  the  eigenvectors  v,  are  the  spectral  components  of 
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Figure  5.  Singularity  of  the  image  processing  matrices. 

(a)  The  eigenvalues  of  [f(k— l)  ffk— 1)‘],  where  f(k-l)  corresponds  to  the  first  32x32 
block  image  of  Image  2. 

(b)  The  eigenvalues  of  [f(k  —  l)  f(k — 1  )T  +  3600*  I],  where  f(k-l)  corresponds  to  the 
first  32x32  bicck  of  Image  2  and  I  is  the  identity  matrix. 


the  parameter  space.  The  value  of  each  \i  determines  the  amplitude  of  the  corresponding  spectral 
components  Uj  and  vr  The  generalized  inverse  to  the  matrix  A  of  Equation  (43)  is  defined  as 

A  =  V  rT  UT  .  (44) 


Using  Equation  (43).  the  generalized  inverse  solution  to  the  set  of  linear  equations.  Ax  -  b.  can  be 
expressed  as  a  linear  combination  of  the  eigenvectors  v  as 


x  =  Z  vi 

i=l 


T  u 

-u,  b 


(45) 


The  weighting  factors  in  this  linear  combination  are  the  quantities  in  the  square  brackets.  The 

T 

products  Uj  b  represent  the  amplitude  of  the  ith  spectral  component  contained  in  the  observations, 
b.  This  quantity  is  divided  by  the  response  Xs.  Thus,  the  quantity  in  the  brackets  represents  the 
amount  of  the  ith  spectral  component  contributing  to  the  solution  parameter-vector  x.  The  equation 
shows  that  the  observational  errors  or  model  errors  in  defining  the  system  dynamics  are  amplified 
by  a  factor,  1/Xj.  Thus  eigenvectors  that  are  poorly  represented  in  the  sense  that  they  are 
associated  with  small  values  of  cannot  be  determined  as  reliably  as  the  better  represented 
eigenvectors.  For  limiting  the  influence  of  the  eigenvectors  with  very  small  eigenvalues,  one 
approach  is  to  eliminate  the  small  eigenvalues  by  simply  adjusting  the  value  of  the  apparent  rank 
of  the  matrix  A.  This  is  done  via  SVD.  We  now  apply  this  approach  to  try  to  find  an  approximate 
inverse  of  f(k—  l)T  which  would  give  an  acceptable  prediction. 

When  we  use  the  SVD  of  f(k— l)T.  we  find  that  an  acceptable  solution  to  the  problem  is  still 
not  possible.  This  is  because  of  the  ill-conditioned  nature  of  the  matrix  which  makes  its  non-zero 
singular  values  vary  over  a  wide  range  as  is  seen  from  Figure  6.  The  values  in  Figures  5-7  relate  to 
prediction  of  Image  3  from  Image  2  and  Image  1.  When  we  attempt  to  restore  the  image  by 
discarding  the  zero  singular  values  and  use  the  non-zero  ones,  we  find  that  the  smaller  singular 
values  effectively  increase  the  contribution  of  the  noise  term  and  make  the  visual  quality  of  the 
image  unacceptable.  The  use  of  only  the  highest  singular  value  provides  acceptable  values  for  the 
objective  criteria  but  results  in  minimizing  a  lot  of  detail  in  the  reconstructed  picture.  We  attempt 
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Figure  6. 


Singularity  of  the  image  processing  matrices. 

la)  The  singular  values  of  f(k— 1)  .  where  f(k-l)  corresponds  to  the  first  32x32  block 
of  Image  2.  (b)  Pixel  intensities  corresponding  to  the  first  8x8  block  cf  Image  2. 
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Figure  7.  Singularity  of  the  image  processing^ matrices. 

<a)  The  singular  values  of  f(k— lp.  where  f(k-l)  corresponds  to  the  first  32x32  block 
of  Image  2  after  subtracting  the  local  mean  taken  over  a  5x5  pixel  window,  (b)  Pixel 
intensities  corresponding  to  the  first  SxS  block  of  Image  2  after  subtracting  the  local 
mean 
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to  lower  the  range  of  the  non-zero  singular  values  by  adjusting  the  matrix  f(k-l)  as  1  blows. 
Instead  of  choosing  f(k-l)  to  represent  the  pixel  intensities,  we  subtract  from  each  pixel  value  its 
local  mem  taken  over  a  5x5  pixel  window  centered  over  that  pixel.  The  resulting  values  for  an 
8x8  pixel  subblock  and  the  corresponding  singular  values  of  f(k— 1)‘  are  shown  in  Figure  7  The 
new  matrix  is  still  seen  to  be  ill-conditioned,  preventing  an  acceptable  solution  to  the  inverse 
problem. 

To  eliminate  the  mathematical  complexities  involved  in  the  derivation  of  the  optimal 
predictor,  we  derive  a  suboptimal  predictor  which  still  uses  the  past  frames  to  form  a  prediction. 
This  is  not  unreasonabe.  since  any  on-line  estimation  algorithm  must  be  computationally  simple  to 
implement.  The  equation  for  the  suboptimal  predictor  for  the  next  frame  prediction  using  the  last 
two  frames  is  as  fellows: 

f(k)  =  [f(k-l)  f(k— 2)T]  f(k — 1 )/  II  f(k-l)  II  2  (46) 

where  II  ■  II  represents  the  Frobenius  norm  . 

For  a  matrix  A. 

1  A  .1  =  trace  (ATA)  . 

The  results  derived  via  the  suboptimal  predictor  are  discussed  briefly  in  the  following  section  and 
in  greater  detail  in  Chapter  6. 

4.4.  Performance  Evaluation 

The  performance  of  the  suboptimal  predictor  is  summarized  in  Table  8  for  one-frame  ahead 
prediction,  using  the  last  two  frames  received.  The  corresponding  results  for  the  fixed  predictor  are 
given  in  Table  4.  and  for  frame  refresh  in  Table  7.  The  estimates  derived  via  this  pedictor  are 
shown  in  Chapter  6.  From  an  evaluation  of  the  mean-square  error  and  SNR  values,  we  can  see  that 
the  suboptima!  predictor  which  uses  the  past  statistics  to  derive  the  estimate,  matches  the 
performance  cf  the  fixed  predictor.  The  fixed  predictor  is  seen  to  give  better  criterion  values  than 
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:i  c-  i  rur.c  ref  rash  ir.  aim-,..'.;  a!!  cases'  However,  subjective  oual.iv  :s  seen  to  be  less  acceptable  than 
■.he  h  w  predictor.  There  docs  nut  appear  to  be  a  significant  advantage  cf  using  this  technique  over 
:;u.-..e  refresh  \*hicb  does  not  require  any  computations.  For  the  adaptive  predictor  case,  if  we  can 
fine,  a  computational  solution  to  the  problem  of  inverting  an  extremely  ill-conditioned  matrix,  then 
he  results  are  expected  to  be  better  than  what  the  fixed  predictor  provides.  In  situations  where  a 
direct  p>euc!e:n verse  algorithm  produces  unusable  restored  images  due  to  the  ill-conditioned  natue 
of  the  problem,  one  approach  suggested  in  [12]  is  to  apply  the  SVD  technique  to  both  row  and 
column  matrices  separately.  The  use  of  this  approach  for  reconstructing  the  images  merits  further 
investigation  and  is  a  topic  for  future  research  in  this  area. 


CHAPTER  5 


APPLICATION  OF  MOTION  ESTIMATION  FOR  IMPROVING  THE  ACCURACY 

OF  FRAME  PREDICTION 

5.1.  Introduction 

A  large  number  of  applications  requiring  image  processing  involve  images  of  moving  objects. 
For  example,  in  our  application  of  satellite  image  processing  for  rendezvous  and  docking,  the 
scenario  is  characterized  by  both  a  moving  target  as  well  as  a  moving  camera.  In  such  cases,  the 
nonstationarity  of  the  images  renders  the  image  prediction  a  more  complicated  process.  To 
accurately  predict  the  image  data  in  advance,  one  must  estimate  and  account  for  the  interframe 
motion.  A  number  of  researchers  have  addressed  this  issue  of  determining  the  motion  of  an  object 
from  a  sequence  of  images.  Motion  compensation  can  be  thought  of  as  a  filtering  process  where  the 
interframe  motion  is  considered  a  noise.  To  the  extent  that  the  signal  and  noise  have  different 
power  spectra,  one  can  filter  the  noise  representing  the  motion.  For  a  stationary  random  process, 
this  could  be  done  via  Wiener  filtering.  However,  scene  dynamics  cannot  accurately  be  modeled  as 
a  stationary  process.  Hence,  one  must  look  for  alternative  means  of  motion  compensation  in  the 
case  of  video  images. 

Dubois  and  Sabri  [13]  apply  temporal  filtering  to  detect  that  part  of  the  image  which  is 
nonstationary  between  frames,  and  use  motion  compensation  to  reduce  the  noise  in  image 
sequences.  Bowling  and  Jones  [14]  use  a  two-step  displacement  procedure  to  determine  pixel 
displacements  between  frames.  Broida  and  Chellappa  [15]  give  a  recursive  estimation  procedure  for 
determining  object  motion  parameters  from  a  sequence  of  images.  In  our  work,  we  applied  the 
algorithm  presented  by  Jain  and  Jain  in  [  1 6]  for  measuring  the  displacement  between  consecutive 
frames  in  integer  number  of  pixels.  There  are  algorithms  available  to  measure  the  pixel 
displacement  with  subpixel  accuracy.  Limb  and  Murphy  [  1 7]  present  such  an  algorithm  for 
measuring  the  speed  of  moving  objects  from  TV  signals;  Cafforio  and  Rocca  suggest  an  algorithm 
for  measuring  small  displacements  in  television  images  in  [  1 S].  The  algorithm  proposed  by  Jain 
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anti  Jain  entails  dividing  the  image  frame  into  subblocks  and  applying  a  2-dimensional  directed 
search  proceduie  to  find  the  displacement  which  optimizes  a  certain  criterion.  In  our  application, 
the  optimization  criterion  is  chosen  as  the  mean-  square  error  between  the  corresponding  biocks  of 
the  consecutive  frame  \  This  is  consistent  with  the  criteria  used  for  a  comparison  of  the  fixed  and 
adaptive  predictor  results.  This  displacement  is  applied  to  the  frame  estimate  to  improve  the 
accuracy  of  prediction.  The  details  of  the  algorithm  are  summarized  in  the  following  section. 


5.2.  Motion  Estimation 

Jain  and  Jain  (l6j  give  an  algorithm  for  measuring  the  interframe  motion  for  digitized  images. 
This  procedure  makes  use  of  the  fact  that  in  practice,  a  large  pa't  of  the  motion  in  a  scene  can  be 
approximated  by  piecewise  translation  of  several  areas  of  a  frame.  The  procedure  consists  of 
segmenting  an  image  into  fixed  size,  rectangular  blocks.  It  is  assumed  that  each  of  these  blocks  is 
undergoing  independent  translation.  Then  the  rotation  and  zooming,  etc.,  of  larger  objects  in  the 
scene  are  approximated  by  a  piecewise  translation  of  these  smaller  blocks. 

We  used  32x32  subblocks  consistent  with  the  block  sizes  used  for  other  algorithm 
computations.  Let  U  be  an  NxN  size  block  of  a  given  frame  in  a  sequence.  Let  V  be  an 
(X+2p)x(.\— 2p)  size  subblock  of  the  consecutive  frame  cf  the  sequence,  centered  at  the  same 
spatial  location  as  U.  Here,  p  is  the  maximum  displacement  allowed  in  either  direction  in  integer 
number  of  pixels.  A  mean  distortion  function,  the  mean-square  error  between  U  and  V.  is  used  as 
the  criterion  for  determining  the  direction  of  minimum  distortion  (DMD)  between  the  two  frames; 
in  other  words,  the  displacement  D  that  minimizes  the  mean-square  error  between  L  and  V. 


£  21  g(u(m,n)  — v(m  +  i,n+j))  .  — p  <  i.j  ^  p  .  where 


g(xl  =  x‘  .orresponds  to  the  mean-square  error  between  U  and  V  In  general.  g(x)  can  be  any 
positive  and  increasing  function  of  x.  We  use  the  mean-square  error  criterion  since  it  is  consistent 
with  the  performance  measures  used  in  this  work.  Also,  this  allows  us  to  obta.n  results  about 
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improvement  in  frame  prediction  in  mean-square  error  (or  signal-to-noise  ratio)  terms. 

The  following  assumption  is  made  about  the  mean  distortion  function.  If  D(q.Z)  =  min{D(i,j)} 

i.j 

then  for  m  =  i— q  ,  n  =  j — Z.  the  functions 

Dj(  I  m  I  ,  I  n  I  )  =  D(i,j)  —  D0(q.Z)  ,  m  ^  0  .  n  ^  0 

D2(  I  m  I  ,  I  n  I  )  =  D(i,j)  --  D0(q,Z)  .  m  ^  0  ,  n  ^  0 

D3(  I  m  I  ,  I  n  I  )  =  D(i.j)  —  D0(q,Z)  .  m  ^  0  .  n  ^  0 

D4(  I  m  I  ,  I  n  I  )  =  D(i.j)  —  D0(q,Z)  ,  m  ^  0  ,  n  ^  0  , 

where  1  •  I  represents  the  absolute  value.  The  above  assumption  means  that  the  distortion  function 
increases  monotonically  as  we  move  away  from  the  direction  of  the  minimum  distortion. 

With  the  above  assumption,  the  algorithm  [16]  uses  a  2-dimensional  directed  search  method 
for  finding  the  DMD.  The  search  consists  of  testing  five  locations  in  a  frame  at  a  time  and 
successively  reducing  the  area  of  search  until  the  plane  of  search  reduces  to  a  3x3  block.  In  the 
final  step,  all  nine  locations  are  searched  to  find  the  DMD  and  the  minimum  mean-square  error. 
The  DMD  and  the  minimum  mean-square  error  for  the  entire  frame  are  obtained  by  averaging  over 
all  subblocks. 

The  algorithm  is  as  follows: 

For  any  integer  p>0. 

N'(p)  =  { (i.j)  :  — p  ^  i.j  ^  p) 

M(p)  =  {(0.0)  .  (p.0)  ,  (O.p)  ,  (—  p.0)  .  (0, — p)}  . 

Step  1:  (Initialiaation) 

D(i.j)  =  oo  .  fi.j)  i  N(p) 

n  =  Clog2p].  [  1  is  a  lower  integer  truncation  function 

n  =  max{2.2r'  ’} 
q  =  Z  =  0  . 
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Step  2  M'(n)  —  Mini. 


Step  3:  f  ind  (i.j)  €  M‘(r.)  such  that  Dt  i-t-q.j+'j  is  minimum.  If  i=0  and  j=0.  go  to  Sum  5; 


otherwise,  go  to  Step 


Step  4:  q  q  +  i  .1  —  14  j  :  \I(n)  —  M'(n)  —  ( — i. — j):  go  to  Step  3. 


Step  5.  n  —  n/2.  !f  n=l.  go  to  Step  6:  otherwise,  go  to  Step 


Step  6:  fund  (i.j)  €  N(l)  such  that  D(i+q.j— f)  is  minimum.  The  DMD  is  then  given  b\ 


q  -  q+i  .  f  -  I+j. 


The  proof  of  the  algorithm  can  be  found  in  [ ! 6] . 


5.3.  Performance  Evaluation 


The  above  algorithm  is  used  tc  derive  the  displacement  between  the  estimates  derived  by  the 


fixed  predictor  algorithm  and  the  actual  frames.  The  results  for  some  of  these  estimates  are 


tabulated  in  Table  9.  The  improvement  in  the  prediction  is  apparent  from  the  reduction  in  the 


MSb  and  the  corresponding  increase  in  the  SNR.  It  is  shown  that  an  improvement  of 


approximately  2  dB  is  obtained  when  we  compensate  for  the  motion. 


We  find  that  even  though  the  displacement  measured  matches  'well  with  the  actual 


displacement  on  a  subblock  basis,  the  approximation  involved  in  measuring  the  displacement  in 


integer  number  of  pixels  degrades  the  accuracy  of  the  overall  displacement  when  averaged  over  the 


entire  image.  The  displacement  algorithm  can  be  applied  to  the  entire  image  without  too  much 


computational  burden.  However,  we  apply  ;t  to  32x32  blocks  for  consistency  with  the  other 


results.  It  is  obvious  that  wnere  averaging  is  expected  over  many  subblocks,  we  require  an 


algorithm  that  gives  the  displacement  with  fractional  pixel  accuracy.  The  works  of  Limb  and 


Murph;  [l7j.  and  CafForio  and  Rocca  [li]  are  concerned  w'ith  the  problem  of  measuring  small 


displacements  in  television  images. 


In  pract  re,  for  me  purpose  of  on-line  motion  compensation,  we  would  estimate  the 


interftame  displacement  between  the  two  most  recently  received  frames  and  aprly  it  to  the 
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Table  9.  (Continued)  Effect  of  Motion  Compensation  on  Interframe  Prediction 
Two-  and  Three-Step  Ahead  Prediction 
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reconstructed  frame.  A  better  result  is  expected  when  we  use  displaced  frames  as  opposed  to  the 
actual  frames  for  deriving  both  fixed  and  adaptive  predictors.  Since  the  displacement  algorithm  can 
be  applied  to  the  entire  image,  an  on-line  motion  compensation  for  an  adaptive  prediction  appears 
to  be  feasible  and  computationally  simple.  These  results  are  discussed  further  in  Chapter  6. 


CHAPTER  6 
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DISCUSSION  OF  TIIE  SIMULATION  RESULTS 


In  this  chapter,  we  discuss  the  results  obtained  from  the  fixed  and  the  adaptive  predictor 
algorithms.  As  described  in  Chapter  1.  in  order  to  demonstrate  the  performance  of  the  algorithms, 
we  use  a  set  of  5  frames  from  a  video  tape  of  a  spacecraft-to-spacecraft  docking  simulation.  Figure 
8  shows  the  8  frames.  These  frames  are  not  successive  frames  of  the  video  tape  with  the  typical 
frame  rate  being  30  frames  per  second,  but  were  taken  at  different  stages  in  the  docking.  We 
selected  frames  representing  far-,  mid-,  and  close-range  in  order  to  determine  the  robustness  of  the 
approaches  for  different  situations.  For  example,  a  change  in  the  camera  azimuth  or  elevation 
introduces  new  and,  therefore,  unpredictable  area  due  to  the  presence  of  unmodeled  dynamics.  The 
first  3  frames  represent  the  far-range.  The  zooming  effect  is  present  between  images  5  and  6.  In 
practice,  however,  a  remote  pilot  can  expect  successive  frames  with  small  interframe  displacement 
due  to  the  very  small  velocity  of  the  spacecraft  in  the  docking  phase.  Hence,  the  results  obtained 
in  this  work  represent  conservative  estimates. 

The  results  of  the  fixed  predictor  for  up  to  3  steps  of  prediction,  and  the  next  frame  predicts,  n 
using  one,  two.  and  three  of  the  most  recent  frames  are  summarized  in  Tables  1-5  using  the  criteria 
defined  in  Chapter  2,  namely.  MABSE,  MSE,  9o  NMSE.  and  SNR.  The  corresponding  pictures  of  the 
reconstructed  images  are  shown  in  Figures  8-13,  respectively.  In  some  cases,  we  present  only  the 
first  picture  of  the  sequence  to  limit  the  total  number  of  pictures,  but  the  analysis  was  carried  out 
for  all  cases  in  all  categories  and  the  results  are  summarized  below  the  figures  to  give  an  idea  of  the 
relative  subjective  merit  of  the  other  estimates.  For  comparison,  the  results  of  the  frame  refresh 
approach  of  prediction  are  summarized  in  Tables  6  and  7  for  one  and  two  steps  of  prediction, 
respectively.  The  side  effects  of  processing  the  images  in  subblocks  cause  minor  degradation  in  the 
reconstructed  picture  quality,  mainly  in  the  form  of  a  border  effect  (more  noticeable  in  the  disc 
towards  the  right).  This  is  caused  by  the  partitioning  of  images  into  32x32  subbloc.-.s  fo: 
proces'  ng  and  thereafter  pasting  the  processed  blocks  to  obtain  the  complete  .mage.  On  the 
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''continued)  Next-frame  prediction  via  fixed  predictor  technique.  The  resulting  images 
and  the  error  between  the  original  and  the  reconstructed  frame  represented  by  the 
signal-to-noise  ratio  (SN'R).  (e)  Image  6  from  Image  5.  SNR  =  27.1  dB.  (f)  Image  8 
from  Image  7.  SNR  -  28.4  dB. 


T  wo-frame  ahead  prediction  via  fixed  predictor  technique.  The  resulting  image  and  the 
erro;  between  the  original  and  the  reconstructed  frame  represented  by  the  signal-to- 
noise  ratio  (SNR).  Image  3  from  Image  1,  SNR  =  33.1  dB. 

Note:  SNR  values  for  the  ether  images  are  as  follows:  Image  4  from  2.  SNR  =  31.7  dB: 
Image  5  from  3,  SNR  =  32.3  dB:  Image  6  from  4,  SNR  27.1  dB:  Image  7  from  6,  SNR  = 
26.1  dB:  Image  S  from  6.  SNR  =  27.2  dB. 
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Figure  11.  Three-frame  ahead  prediction  via  fixed  predictor  technique.  The  resulting  image  and 
the  error  between  the  original  and  the  reconstructed  frame  represented  by  the  signal- 
to-noise  ratio  (SNR).  Image  4  from  Image  1.  SNR  =  31.9  dB. 

Note:  SN'R  values  for  the  other  images  are  as  follows:  Image  5  from  Image  2,  SNR  = 
31.3  dB;  Image  6  from  Image  3.  SNR  =  27.1  dB;  Image  7  from  Image  4,  SNR  =  26.5  dB: 
Image  8  from  Image  5.  SNR  =  24.9  dB. 
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Figure  13.  (Continued)  Next-frame  prediction  via  fixed  predictor  technique  using  last  three 
frames  received.  The  resulting  images  and  the  error  between  the  original  and  the 
reconstructed  frame  represented  by  the  signal-to-noise  ratio  (SNR). 

(d)  Image  7  from  Images  6.  5.  and  4,  SN'R  =  28.5  dB.  (e)  Image  8  from  Images  7,  6.  and 
5.  SNR  -  27.9  dB. 
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monitor,  we  can  measure  that  distance  to  be  exactly  32  pixels  wide.  This,  however,  does  not  make 
the  image  unacceptable. 

From  an  objective  evaluation  of  the  fixed  predictor  versus  frame  refresh.  we  can  infer  that  the 
fixed  predictor  performs  better  than  the  frame  refresh  in  almost  ail  cases.  However,  the  subjective 
quality  of  the  reconstructed  images  is  such  that  there  does  not  seem  to  be  a  significant  advantage  of 
using  this  predictor  with  its  attendant  computations.  The  significant  limitation  of  the  fixed 
predictor  is  in  those  situations  where  the  next  frame  has  new  and.  therefore,  unpredictable 
information.  The  structure  of  the  model  chosen  in  Chapter  3  does  not  allow'  us  to  compensate  for 
such  physical  phenomena  as  tilting  or  pan.  which  introduce  new  information.  This  is  obvious  from 
the  lower  criterion  values  and  poor  quality  cf  the  estimates  when  predicting  certain  images 
compared  to  predicting  others.  For  example,  the  estimate  of  image  6  from  5  is  less  accurate 
compared  to  the  other  estimates.  When  we  apply  the  displacement-measurement  algorithm 
described  in  Chapter  5  to  the  estimates,  we  find  that  improvements  in  SNR  of  the  order  cf 
approximately  2  dB  are  achievable.  The  results  are  summarized  in  Table  9  for  a  set  of  cases.  This 
is  a  significant  gain  since  the  results  thus  obtained  are  approximately  3  dB  better  than  those 
available  with  the  frame  refresh  technique.  The  displacement  compensation  procedure  assumes 
that  over  a  short  sequence  of  frames,  the  interframe  displacement  can  be  assumed  constant.  Then, 
for  the  prediction,  we  would  use  instead  cf  the  actual  frames,  frames  displaced  by  the  same 
amount  as  the  displacement  between  the  preceding  two  frames.  In  this  work,  we  used  only  actual 
frames.  Here,  it  is  appropriate  to  point  out  that  an  algorithm  that  measures  frame  displacement 
with  fractional  pixel  accuracy  is  required  to  give  an  accurate  estimate  fer  the  mage  when  averaged 
over  all  its  subblocks.  Otherwise,  me  approximation  over  individual  subblocks  makes  the  overall 
estimate  less  accurate.  In  the  specific  application  that  we  are  considering,  namely,  that  of  satellite 
rendezvous  and  docking  operation,  the  scene  is  characterized  by  very  slow  motion  which  would 
make  the  intenrame  displacement  considerably  smaller  than  what  we  see  here.  As  mentioned 
before,  the  frames  are  not  successive  frames  from  the  tape,  but  were  deliberate''.’  cho^e '.  to  ''e 
several  frame:;  apart.  In  eithei  case,  the  algorithms  are  not  very  complex,  henc,.  an  on-line 


movement-compensated  prediction  appears  to  be  feasible  and  practical.  This  area  requires  further 
investigation  as  it  is  expected  to  give  better  visual  results  than  what  we  have  obtained  from  the 
fixed  predictor. 

Another  conclusion  that  we  can  draw  from  an  evaluation  of  the  estimates,  and  the  tables 
summarizing  the  criterion  values,  is  that  one-step  ahead  prediction  almost  always  performs  better 
than  two-  and  three-step  ahead  predictions  in  that  order.  Where  the  interframe  motion  is  small, 
e.g..  between  images  1  and  2.  the  prediction  is  more  accurate  than  in  the  other  cases  (Figure  9).  An 
evaluation  of  the  results  of  the  two-  frame  ahead  prediction  (Figure  10)  shows  a  good  picture  with 
an  attendant  SNR  value  of  33.1  dB.  Three-frame  ahead  prediction  does  not  appear  to  be  as  good  as 
the  other  two  (Figure  11).  A  subjective  evaluation  of  Figure  12  (next-frame  prediction  using  last 
two  frames  received)  shows  that  the  prediction  of  image  3  from  2  and  1,  image  7  from  6  and  5,  and 
image  8  from  7  and  6  are  the  acceptable  ones.  This  again  appears  to  be  a  result  of  the  interframe 
displacement.  There  also  appears  to  be  some  duplication  of  feature  points,  more  apparent  in  some 
cases  than  the  others.  This  is  because  prediction  using  the  last  two  frames  received  can  be  thought 
of  as  a  projection  of  the  last  two  frames  onto  the  next  frame.  A  change  in  the  location  of  certain 
feature  points  between  consecutive  frames  results  in  a  shadow  effect  on  the  frame  estimate.  The 
results  of  the  next  frame  prediction  using  the  last  three  frames  received  is  the  worst  in  terms  of  the 
visual  perception  of  the  scene.  The  blurring  which  makes  the  target  interpretation  almost 
impossible  is  caused  by  that  portion  of  the  image  which  is  nonstationary.  This  can  be  inferred 
from  the  fact  that  in  that  class,  the  acceptable  estimates  are  those  of  image  8  (from  7,  6.  and  5)  and 
image  7  (from  6.  5,  and  4).  In  the  original  frame  sequence,  the  most  significant  interframe 
displacement  is  between  images  2  and  3.  which  is  why  the  other  estimates  seem  so  blurry.  In 
almost  all  the  cases,  the  SNR  is  approximately  30  dB. 

The  results  of  the  suboptimal  adaptive  predictor  are  summarized  in  Table  8.  The 
corresponding  pictures  are  presented  in  Figure  14.  From  the  objective  evaluation,  the  suboptimal 
predictor  matches  the  performance  of  the  fixed  predictor  which,  as  we  have  seen,  outperforms 


Figure  14.  (Continued)  Next-frame  prediction  via  a  suboptimal  adaptive  predictor  technique 
using  last  two  frames  received.  The  resulting  images  and  the  error  between  the  origi¬ 
nal  and  the  reconstructed  frame  represented  by  the  signal-to-noise  ratio  (SN'R). 

(c)  Image  5  from  Image  4  and  3.  SNR  *  32.1  dB. 

(d)  Image  6  from  Image  5  and  4,  SNR  =  27.0  dB. 


frame  refresh.  However.  the  approximation  of  the  inverse  transformation  used  in  the  derivation  cf 


*- he  predictor  introduces  noise  in  the  image  recovery  which  is  seen  as  the  blurring  effect.  The  only 
acceptable  estimate  1:1  this  category  is  that  of  image  3  based  on  images  2  and  1.  When  we  display 
the  original  images  1.  2.  and  3.  we  find  that  the  relative  displacement  between  1  and  2  matches 
closely  with  that  of  2  and  3  This  suggests  that  in  cases-  where  the  assumption  of  wide-sense 
stationunty  is  valid,  the  prediction  is  indeed  feasible  and  of  an  acceptable  quality.  As  described  in 
Chapter  4.  an  investigation  of  the  solution  of  singular  matrices  is  in  order  since  that  is  expected  to 


improve  on  the  results  obtainable  with  a  fixed  predictor  as  well  as  a  suboptimal  adaptive  predictor. 


The  aim  of  this  work  is  to  investigate  if  it  is  realistic  to  model  the  scene  dynamics  as  a 
discrete-time  linear  slat?  vector  model,  and  estimate  the  model  structure  which  is  otherwise  either 
completely  unspecified  or  ill-defined.  We  do  this  by  either  exploiting  some  inherent  properties  of 
the  images  by  trying  to  estimate  the  dynamics  via  an  analysis  of  the  available  information.  In 
either  case,  the  subjective  quality  cf  the  reconstructed  estimates  is  such  that  the  computations 
involved  ir.  deriving  the  predictors  may  not  be  justified.  However,  in  some  other  cases,  such  as 
movement-compensated  predictom.  it  may  be  possible  to  obtain  far  better  results  in  which  case  the 
advantage  over  frame  refresh  may  warrant  the  added  computations.  This  area  merits  further 
investigation.. 


CHAPTER  7 
CONCLUSIONS 

The  problem  of  dynamic  estimation  of  unknown,  time-varying  parameters  using  a  priori 
information  is  considered.  The  unknown  parameters  are  the  image  pixel  intensities  of  the  next 
frames  of  a  video  sequence.  Also,  the  dynamics  of  the  model  are  not  available  and  must  be 
estimated.  The  measurable  data  are  the  pixel  intensities  of  the  preceding  frames.  We  approach  the 
problem  by  first  representing  the  image  dynamics  as  a  linear  state  space  vector  model  where  each 
frame  is  represented  as  a  state  vector  of  dimension  N(  =  \T]XN2),  sampled  at  discrete  instants.  We 
then  attempt  to  use  the  a  priori  information  to  estimate  the  dynamics  of  the  model  and  to  derive  an 
on-line  adaptive  estimation  of  the  unknown  parameters.  In  the  fixed  predictor  case,  we  use  the 
inherent  adjacent-pixel  correlation  of  the  image  data  for  deriving  the  state  vector  model.  The 
intent  of  the  work  is  to  determine  if  it  is  realistic  to  model  the  image  dynamics  via  these 
approaches  and  if  so,  to  investigate  if  it  is  possible  to  obtain  a  significant  improvement  over  the 
frame  refresh  technique  which  requires  no  computations  at  all. 

We  find  that  it  is  feasible  to  do  an  on-line  prediction  of  the  image  data  and  if  certain 
conditions  such  as  slow  dynamics  are  satisfied,  then  it  is  possible  to  match  the  performance  of  the 
frame  refresh  technique  according  to  the  objective  criteria.  The  conditions  assumed  are  realistic  in 
the  context  of  teleoperator-controlled  remote  piloting  application  which  is  the  main  motivation  for 
this  work.  It  is  also  shown  that  in  the  case  of  slow  dynamics,  we  can  improve  the  criterion  values 
by  predicting  the  interframe  displacement  and  compensating  for  it.  However,  from  a  subjective 
evaluation  of  the  results  of  the  fixed  and  adaptive  predictors  versus  the  frame  refresh,  we  find  that 
there  is  not  a  tremendous  improvement  over  the  frame  refresh  technique.  The  latter  is  attractive 
since  it  does  not  require  any  computations.  We  also  find  that  a  significant  limitation  of  the  fixed 
predictor  approach  is  in  those  situations  where  the  scene  is  characterized  by  rapid  movement  which 
introduces  new  information  in  the  form  of  unmodeled  dynamics.  This  indicates  that  the  pixel 
correlation  by  itself  may  not  be  sufficient  to  model  the  image  dynamics.  In  such  cases,  the  use  of 
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an  optimal  adaptive  predictor  would  be  in  order  provided  a  computationally  simple  solution 
approximation  to  the  problem  of  inverting  singular  matrices  could  be  found.  The  suboptir.tal 
predictor  used  in  this  case  gives  good  criterion  values.  However,  the  matrices  representing  the 
image  pixel  intensities  are  seen  to  be  extremely  ill-conditioned  making  them  singular.  liven  a 
generalized  inverse  solution  fails  to  provide  acceptable  estimates.  This  is  because  the  approximation 
used  for  the  generalized  inverse  of  the  ill-conditioned  matrices  greatly  degrades  the  visual  quality 
of  the  images  by  effectively  increasing  the  contribution  of  the  noise  v/hich  represents  the 
unmcdeled  dynamics. 

From  the  video  sequence  that  was  analyzed  and  the  specific  predictors  used  in  this  work,  it 
appears  that  the  improvement  over  frame  refresh  is  not  significant  enough  to  warrant  the 
computations  required  for  these  approaches:  this  does  not  preclude  the  possibility  of  obtaining 
significant  improvement  when  using  other  approaches.  Two  areas  that  merit  further  investigation 
in  this  regard  are  as  follows: 

Movement-compensated  on-line  prediction  appears  to  be  a  promising  approach.  Our  work  in 
Chapter  5  shows  that  it  is  possible  to  improve  on  the  criterion  values  by  estimating  interframe 
displacement  and  compensating  for  it.  It  is  expected  that  if  one  were  to  use  the  displaced  frames 
instead  of  the  actual  frames  for  image  data  prediction  and  thus  account  for  the  interframe 
displacement,  then  the  predicted  frames  would  be  more  accurate  than  what  one  could  get  from  the 
frame  refresh.  This  is  a  topic  for  future  research  in  this  area. 

Another  area  which  merits  furt.ier  investigation  is  the  problem  cf  the  ill-conditioned  nature 
of  the  image  processing  matrices.  Andrews  and  Patterson  [19]  and  Huang  [20]  suggest  an  approach 
for  solving  this  problem  where  a  direct  pseudoinverse  gives  unacceptable  restored  images  due  to  the 
ill-conditioned  nature  of  the  matrices.  This  approach  involves  representing  the  image  mode!  as  a 
sr:  arable  space-variant  model  and  consists  of  applying  S\  D  to  both  row  and  column  matrices 
separately.  This  is  equivalent  to  decomposing  the  images  into  eigenimages  and  reconstructing  the 
image  by  selectively  discarding  certain  eigenimages.  This  area  requires  further  investigation  as  an 
amolicati-  n  of  „  -eralized  inverses  for  image  oro*essing  systems. 
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APPENDIX  A 


SUPPORTING  SOFTWARE 


A.l.  Software  for  Fixed  Predictor  and  Data  Manipulation  for  D/A  Conversion 


program  pred 


PURPOSE: 

This  routine  computes  the  next  frame  of  the  video 
sequence  using  the  last  frame  received  and  prepares 
the  frame  estimate  for  D/A  conversion  by  converting 
it  from  real  to  binary  data. 


ROUTINES  CALLED:  MTXMLT 


LOCAL  VARIABLES,  INCLUDING  DIMENSIONS  AND  DESCRIPTION: 

X(  1 :  32 , 1 :  32  )  ,  Y(l:32,l:32)  -  input  matrices  which  hold 
32x32  block  of  Image  1  and  Image2  respectively 
after  converting  integers  to  real  numbers. 

M(  1 : 32 , 1 :  32 ) ,  N(l:32,l:32)  -  input  matrices  which  hold 
32x32  block  of  Image  1  and  Image2  respectively  after 
converting  binary  numbers  to  integers. 

XEST( 1 : 32 , 1 : 32 )  -  matrix  containing  the  estimate  derived. 

PHI ( 1 : 32 , 1 : 32 )  -  matrix  representing  the  fixed  predictor. 

Datal(512),  Data2(512),  Xsdata(512)  -  512  byte  arrays  to 
hold  one  record  of  imagel,  image2  and  the  estimate 
respectively. 

Xdata(512)  -  integer  array  to  hold  the  estimate  data 
after  real  to  integer  conversion. 

XSl.dat  -  XS8.dat  -  files  to  hold  the  reconstructed 

image  as  it  is  being  created  column-  by-  column. 

■  *********** ★**★★*****★* ************************* * * 


implicit  real*8  (a-h,o-z) 
implicit  integer*4  (i-n) 
byte  datal(512) 
byte  data2  (512) 
byte  xsdata  (512) 
integer  *2  xdata(512) 
integer  *4  M(l:32,l:32) 
integer  *4  N(l:32,l:32) 
real  *8  PHI ( 1 : 32 , 1 : 32 ) 
real  *8  X(l:32,l:32) 
real  *8  XEST( 1 : 32 , 1 : 32 ) 
real  *8  Y( 1 : 32 , 1 : 32 ) 


character  *20  Imagel,  Image2 


write  (6,*)  'Type  filename. typ  for  the  first  file 
read  (5,1)  Imagel 
format  (A) 

write  (6,*  )  'Type  filename. typ  for  the  next  file 
+  of  the  sequence . ' 
read  (5,2)  Image2 
format  (A) 

Open  all  the  units 
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do  i=3 , 30 , 27 
do  k=l , 4 

phi ( i , i )  =  1.0/  6.5 
if  ( ( i+k )  . le .  32 )  then 
phi ( i ,  i  +  k)  =  (0.96  **  k)  /  6.5 
end  i  f 

if  ( (  i-k )  . ge .  1  )  then 

phi ( i , i-k)  =  (0.96  **  k)  /  6.5 

end  i  f 

end  do 

end  do 

do  j=4,29,25 
do  k=l,4 

phi ( j , j )  =  1.0/  7.38 
if  ((j+k)  .le.  32)  then 
phi ( j , j  +  k) =  (0.96  **  k)  /  7.38 
end  i  f 

if  ( ( j -k )  . ge .  1  )  then 
phi ( j , j -k )  =  (0.96  **  k)  /  7.38 
end  i  f 
end  do 
end  do 
do  i=5,28 
do  k=l,4 
phi ( i , i )  = 
if  ( (  i  +  k ) 
phi  (  i ,  i  +  k)  = 
end  i  f 

if  ( (  i-k )  . ge 


1.0/8.23 
le.  32)  then 

(0.96  **k )  /8 . 23 

1  )  then 


phi  ( i ,  i-k)  =  (0.96  **  k)  /  8.23 
end  i  f 
end  do 
end  do 

Read  Imagel 
npin  =  512 
npout  =  32 
nrout  =  32 

Set  the  initial  values  for  performance  measures 
suml  =  0.0 
sum2  =  0.0 
sum3  =  0.0 
sum4  -  0.0 
sum5  =  0.0 
sum6  =  0.0 
sum  7  =0.0 
sum8  =  0.0 
snr  =  0.0 

Start  processing  column  1:  8 
do  10  npskip  =  0,224,32 

Start  processing  a  32x32  block  in  the  first  column 
do  20  k  =  1 , 8 


/Vj 


•  ‘V. 


■'v.a 

».  i 


•fil 

•il 

v!) 

»• p 


til 


ss 


I 

ft? 

,».'5 


Ww 


Read  next  trout  (=52  in  tms 
do  i  =  1 ,  nrout 

read  (11)  (datal(j),  j  =  l,npir. ) 
do  j  =  1 ,  npout 

M  (  i ,  j  )  =  j  z  e  x  t  (  datal(npskip-rj)  ) 
X  (  i  ,  j  )  =  d  f  1  o  t  j  (  M  (  i  ,  j  )  ) 
end  do 
end  do 

Cross-check . 
write  (6,3)  K , X  <  1,1) 

format  (/,lx,'3LCCK  if  '  ,  14  ,  '  ,  X  (  1 ,  1 )  = 
call  mtxmlt  ( phi , 32 , 32 , X , 32 , 32 , X3ST ) 
Start  writing  a  32x32  block  at  a  tim 


case)  recc 


14,',  X ( 1 , 1 )  =  '  , 
X, 32 , 32 ,X2ST) 
)lock  at  a  time, 


F9  .  3 


Start  writing  a  32x32  block  at  a  time,  i.e., 
npskip^l  :  npskip+32  after  doing  real  to  integer 
conversion  (integer  data  written  to  binary  files! 
do  kp= 1,32 

if  (npskip  . eq.  0)  then 
go  to  303 

else  if  (npskip  . eq .  32)  then 
read  (12)  (xsdata(j),  j  =  1, npskip) 
else  if  (npskip  .eq.  64  )  then 
read  (14)  (xsdata(j),  j  =  1 , npskip) 
else  if  (npsk-p  .eq.  96)  then 
read  (15)  (xsdata(j),  j  =  1, npskip) 
else  if  (npskip  .eq.  128)  then 
read  (16)  (xsdata(j),  j  =  1, npskip) 
else  if  (npskip  . eq .  160)  then 
read  (17)  (xsdata(j),  j  =  1, npskip) 


32)  then 
j  =  1, npskip) 
64  )  then 
j  =  1 , npskip) 
96)  then 
j  =  1 , npskip) 


read  ( 17 )  ( xsdata ( j 


, npskip) 


else  if  (npskip  .eq.  192)  then 
read  (15)  (xsdata(j),  j  =  1, npskip) 
else  if  (npskip  .eq.  224)  then 
read  (19)  (xsdata(j),  j  =  1, npskip) 


do  1=1,32 

xdata ( npsk ip+ 1 )  =  iidnnt(  xestlkp,! 
if  ( xdata ( npsk ip- 1 )  .ge.  127)  then 
xdata ( npsk io  + 1 )  =  ( xdat a ( nosk ip+ 1 ) ) 


xscata  (nosxi; 


xdata ( nosk io- 1 ) 


ena  co 

if  (npskip  .eq.  0)  then 

write  (12)  i  xsdata  i  j  )  ,  j  =  1 ,  (  npsk  i::4- 

else  if  (npskip  .eq.  32)  then 

write  (14)  (  xsdata ( j ) ,  j  =  . , ( npskip* 

else  if  'npskip  . eq .  64)  then 

w  r  i  - ■=■  (13)  (  xsdata  ( j )  ,  j  =  1 ,  '  .npskip* 

else  if  (r.pskip  .  eq .  93)  then 

write  (16)  (  xsdata  (  j  )  .  j  =  l  ,  '  r.pskip* 

else  if  (npskip  . eq .  123)  then 

write  (17)  ■.  xsdata  (  j  )  ,  j  =  1 ,  (  nos-:  ip*- 
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else  if  (npskip  .eq.  160)  then 

write  (18)  (  xsdata(j),  j =1 , ( npskip+32 )  ) 

else  if  (npskip  .eq.  192)  then 

write  (19)  (  xsdata(j),  j =1 , ( npskip+32 )  ) 

else  if  (npskip  .eq.  224)  then 

write  (20)  (  xsdata(j),  j =1 ,( npskip+32 )  ) 

end  i  f 

end  do 

Cross-check . 
write  (6,4)  xest(l,l) 
format  ( lx, ’ XEST( 1 , 1 )  = ’ , F9 . 3  ) 

Now  read  IMAGE2.DAT 

Read  nrout  (=32  in  this  case)  records  at  a  time! 
do  i  =  1 ,  nrout 

read  (13)  (data2(j),  j=l,npin) 
do  j  =  1 ,  npout 

N(i,j)  =  jzext(  data2 ( npsk ip+ j )  ) 

Y  (  i  ,  j  )  =  df lot j (  N ( i , j )  ) 
end  do 


end  do 

Cross-check . 
write  (6,5)  Y( 1 , 1 ) 
format  (IX, 'Y(l,l)=' , F9 . 3 ) 

Compute  mabse,  mse,  %  nmse  and  SNR. 
do  i  =  1,32 


do  j  =  1 , 3  2 

suml  =  suml  +  (dabs  (Y(i,j)  -Xest(i,j))) 
sum2  =  sum2  +  (  Y(i,j)  -Xest(i,j)  )  **  2.0 
sum3  =  sum3  +  (Y(i,j)  **  2.0) 
end  do 
end  do 

Close  the  loop  for  each  32x32  block  (total  8). 
continue 

After  8  iterations,  close  the  loop  for  individual 
columns  (total  8),  i.e.,  npskip+1:  npskip+32,  and 
print  the  performance  measures  for  each  column 
sum4  =  sum2  /  sum3 
suml  =  suml  /  (1024  *  8) 
sum2  =  sum2  /  (1024  *  8) 
sum3  =  sum3  /  (1024  *300) 
write  (6,7)  suml,  sum2,  sum4 ,  sum3 
format  (lx, 'MA3SE=' ,E9.3,lx, ’MSE=' ,E9.3,lx, ' NMSE= ' ,E9.3, 
•  lx,  'MEAN  IMAGE  VAR.  /100  =’,E12.6) 

Reinitialize  the  performance  measures  for  each 
column  iteration 
sum5=  sum5  +  suml 
sum6  =  sum6  +sum2 

sum/  =  sum7  +  sum3 

suml  =  0.0 

sum 2  =  0.0 


sum 3  =  0.0 
sum  4  =  0.0 

Close  units  11,12,13  and  reopen  in  order  t  u 
start  reading  records  from  the  1st  record, 
rewi nd  (11) 
rew i nd  (13) 

if  (npskip  . eq .  0)  then 
close  (12) 

open  ( un i t= 12 , f i le= ' xsl . dat ' , status= ' old ' , form  = 
'unformatted',  readonly,  recordtype= ' f ixec ' ,  recl-8) 
else  if  (npskip  .eq.  32)  then 
close  (14) 

open  ( un i t=l 4 , f i le= ' xs2 . dat ', status- ' old ', form  = 

' un format  ted ', readonly ,  recordtype= ' f ixed' ,  recl  =  16) 
else  if  (npskip  .eq.  64)  then 
close  ( 15 ) 

open  ( un i t = 1 5 , f i le= ' xs3 . da t ' , status= ' old ' ,  form= 

' unformatted ', readonly ,  recordtype= ' f ixed ' ,  recl=24) 
else  if  (npskip  . eq .  96)  then 
close  (16) 

open  ( uni t  =  lb , f ; le= ' xs4 .dat ', status= ' old ', form  = 

' unformat  ted ', readon ly ,  recordtype= ' f ixed' ,  recl  =  32) 
else  if  (npskip  .eq.  128)  then 
close  (17) 

open  ( uni t=17 , f i le= ' xs5 . dat ', status= ' old ', form  = 

' unformatted' , readonly ,  recordtype= ' f i xed ' ,  recl=40) 
else  if  (npskip  .eq.  160)  then 
close  (18) 

open  (unit=18 , f ile= ' xs6.dat ', status= ' old' , form  = 

' unformatted ', readonly ,  recoratype= ' f ixed ' ,  reci=48) 
else  if  (npskip  .eq.  192)  then 
close  (19) 

open  ( uni t=19 , f i le= ' xs7 . dat ', status= ' old ', form  = 

' unformatted ', readonly ,  recordtype= ' f ixed ' ,  recl=56) 
else  if  (npskip  .eq.  224)  then 
close  (20) 
end  i  f 
cont i nue 

write  (6,11)  Image2 ,  Imagel,  Npskip 
format  (lx, 'THIS  PROGRAM  PREDICTS A , ’  FROM  ',  A  , 

'  FOR  A  256X32  BLOCK  AFTER  DELET I NG '  , I  4 ,  '  COLUMNS.', 
/, 'FINISHED  PROCESSING  THE  FIRST  256X256  IMAGE') 

SUM5  =  SUM5  /  8. 

SUM 6  =  SUM6  /  8 . 

SUM  7=  SUM7  /8 . 

SUMS  -  SUM6/SUM7 

SNR  =  SNR  +  ((255  **  2.0'/  SUMo) 

SNR  =  10.0  *  dlogiO  (snr) 
write  (6,12)  sum5,  sum6 ,  sum7 
format  (lx, 'TOTAL  MA3SE= ' , E9 . 3 , 1 x , ' TOTAL  MSE- ' , E9 . 3 , i x , 
'MEAN  IMAGE  VAR./ 100  OVER  THE  ENTIRE  BLOCK  =',S12.b) 

write  ( o ,  13)  sum3,snr 

format  (  IX,  '  °5  TOTAL  NMSE--  ,  E9  .  3  ,  IX  ,  '  SNR=  ’  ,  •  6  .  3  ) 
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subroutine  mtxmlt (mtxa , nrowa , ncola ,mtxb, nrowb, ncolb,mtxc ) 
c 

c  PURPOSE: 

c  This  subroutine  computes  the  matrix  product  of  two  matrices, 
c  CALLING  SEQUENCE: 

c  CALL  MTXMLT  ( MTXA , NROWA , NCOLA , MTXB , NROWB , NCOLB , MTXC ) 

c  MTXA:  (input)  first  matrix  of  product,  maximum  dimension 

c  MTXA(  1  :  32,1  :  32  )  . 

c  NROWA:  (input)  number  of  rows  in  MTXA. 

c  NCOLA:  (input)  number  of  columns  in  MTXA. 

c  MTXB:  (input)  second  matrix  of  product,  maximum  dimension 

c  MTXB( 1:32,1:32). 

c  NROWB:  (input)  number  of  rows  in  MTXB. 

c  NCOLB:  (input)  number  of  columns  in  MTXB. 

c  MTXC:  (output)  resultant  product  matrix  of  MTXA  times 

c  MTXB,  maximum  dimension  MTXC ( 1 : 32 , 1 : 32 ) . 

c  ROUTINES  CALLED:  none, 
c  LIMITATIONS: 

c  The  maximum  dimension  of  the  input  and  output  matrices 
c  is  32x32 . 
c 

c  *******************  END  OF  PREFACE  ****************************** 
implicit  real*8  (a-h,o-z) 
implicit  integer*4  (i-n) 

real*8  mtxa ( 1 :  nrowa ,  1 :  ncola )  ,  mtxb(  1 :  r.rowb,  1 :  ncolb) 
real*8  mtxc ( 1 : nrowa , l:ncolb) 
if  ( ( ncola-nrowb) . eq. 0 )  then 

if  ( ( nrowa . le . 32 ) . and. (nrowb. le . 32 ) . and . ( ncola . le . 32 ) . and . 

>  ( ncolb. le . 32 ) )  then 
do  100  i  =  1, nrowa 

do  200  j  =  1, ncolb 
mtxc ( i , j  )  =  0.0 
do  300  k  =  1, ncola 

mtxc(i,j)  =  mtxc(i,j)  +  mtxa ( i , k ) *mtxb( k , j ) 

300  continue 

200  continue 

100  continue 

else 

write(20,9000) 
end  i  f 
else 

write  (20,9010)  ncola,  nrowb 
write  (20,9020) 
end  if 

9000  format  (/,'  dimension  of  one  or  both  MTXMLT  input', 

>  '  matrices  are  larger  than  32x32.') 

9010  format  (/,'  mtxa  has',i2,'  columns;  mtxb  has  ' , i 2 , 

>  '  rows .  '  ) 

9020  format  (/,'  matrix  multiplication  is  not  defined  in', 

>  '  this  s i tuat ion . ' ) 
return 

end 


A.2.  Software  for  Interframe  Displacement  Estimation 

program  motion 
c 

c  PURPOSE:  This  routine  computes  the  interframe  displacement 
c  between  2  frames  of  a  video  sequence  using  a  minimum  mean 
c  square  error  criterion  accoding  to  the  algorithm  by  Jain 
c  and  u  a i n . 

c  LOCAL  VARIABLES  INCLUDING  DIMENSIONS  AND  DESCRIPTION: 
c 

c  X ( 1 : 3 2 , 1 : 3 2 ) ,  Y(l:32,i:32)  -  input  matrices  to  hold  32x32 
c  blocks  of  images  1  &  2,  respectively  after  integer  to  real 
c  conversion. 

c  M( 1 : 32 , 1 : 32 ) ,  N(1:32,I:32)  -  input  matrices  to  hold  the 
c  above  data  after  byte  to  integer  conversion, 

c  3data(5i2),  Data(512)  -  512  byte  arrays  to  hold  one  record 

c  of  imagel  and  image  2 , respect ively ,  at  any  given  time. 

c  Sum(l:5)  -  array  to  hold  the  variances  for  a  5-point  search, 
c  Sum2(l:9)  -  array  to  hold  the  variances  for  the  final 
c  9-point  search. 

c  Summ,  Summ2  -  working  variables  for  the  minimum  variance 
t  over  5-  and  9=point  search,  respectively, 

c  P,  PK,  NN  -  working  variables. 

c  Q,  L  and  TQ,  TL  -  working  variables  for  the  x,y  displacement 
t  over  a  block,  and  over  the  entire  image,  respectively. 

:  VALI (1:5),  VALJ ( 1 : 5 )  and  VALI2(1:9),  VALJ2(1:9)  -  arrays  to 

:  hold  the  x,y  values  for  the  5-  and  9-point  search, 

:  resDect ively . 


implicit  real*8  (a-’n,o-z) 
implicit  inteaer*4  (i-n) 
byte  baata(512) 
byte  data  (512) 

character  *63  Infilel,  Infile2 
real  *3  sum(l:5),  sum2(l:9) 
real  *8  summ,  summ2 ,  tsum 
integer  *4  a,  p,  pk,  nn,  tq,  tl 
real  *8  X(l:43,l:48) 
real  *8  Y(l:64,l:64) 
integer  *4  val i ( 1 : 5 ) 
integer  *4  va 1 j ( 1 : 5 ) 
integer  *4  vali2(l:9) 
integer  *4  valj2(l:9) 
integer  *4  M( 1:48, 1:48) 
integer  *4  N (1:64, 1:64) 

write  (6,*)  'Type  Fi lename . type  for  the  estimate 
Read  15,1'  Infilel 

write  iS,*)  ’What  is  it  the  estimate  of?  Type  " 
for  the  estimate  file.  1 
Read  '5,1)  I  nf  i  ie2 


r*l  < 
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Initialize  variables 
npinl  =  256 
npoutl  =  32 
nroutl  =  32 
npin2  =  512 
tq=0 
1 1  =  0 

tsum  =  0.0 

do  20  npskipl  =  0,224,32 
do  10  nrskipl  =  0,224,32 

Open  all  the  units 

open  (unit=ll,  file=  infilel,  status=’old’ 
form= ' unformatted ' ,  recordtype= ' f ixed’ 
open  (unit=12,  file=  infile2,  status='old' 
fcrm= ' unformatted ’ ,  recordtype= ' f ixed ' , 


recl=64 ) 
recl=128 ) 


npout2  =  0 
nrout2  =  0 

For  each  32x32  pixel  sub-block  in  the  1st  image, 
pick  a  64x64  sub-block  in  the  2nd  image  centered 
at  the  same  spatial  location.  In  the  corners 
we  would  have  only  48  pixels  along  one  or  both  axes. 

If  (nrskipl  .ge.  32)  then 
nrskip2  =  nrskipl  -16 
else  if  (nrskipl  .eq.  0)  then 
nrskip2  =  nrskipl 
end  i  f 

If  (npskipl  .ge.  32)  then 
npskip2  =  npskipl  -16 
else  if  (npskipl  .eq.  0)  then 
npskip2  =  npskipl 
end  i  f 

If  ((nrskipl  .eq.  0)  .or.  (nrskipl  .eq.  224))  then 

nrout2  =  48 

else 

nrout2  =  64 
end  i  f 


If  ( ( npskipl  . < 
npout2  =  48 
else 

npout2  =  64 
end  i  f 

do  i  =  1 , 4  8 
do  j  =  1 , 4  8 
X (  i ,  j  )  =  0.0 


,eq.  0)  .or.  (npskipl  . eq .  224))  then 
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M  (  i  ,  j  )  =  0 
end  do 
end  do 
do  k=l , 64 
do  1=1,64 
Y(k, 1)  =  0.0 
N ( k ,  1  )  =0 
end  do 
end  do 
do  1  =  1,5 
vali(l)  =  0 
val  j ( 1 )  =  0 
end  do 
do  p  =  1,9 
vali2(p)  =  0 
val j  2 (p)  =  0 
end  do 

p=0 

k  =  0 
pk  =  0 

c  Read  Infilel. 

c  First  skip  nrskipl  records! 

do  i  =1, nrskipl 
read (11)  bdata ( 1 ) 
end  do 

c  Read  next  nroutl  records! 

do  i  =  17,  48 

read  (11)  (bdata(j),  j=l,npinl) 
k  =  1 

do  j  =  17,  48 

M  (  i ,  j  )  =  jzext(  bdata ( npsk ipl+k )  ) 

X(  i , j  )  =  df lotj (  M { i , j )  ) 
k  =  k  +1 
end  do 
end  do 

c  Cross-check. 

If  (nrskipl  . eq .  224)  then 

write  (6,890)  NPSKIP1,  NRSKIP1,  X(17,17) 

390  format  (lx,’NPSKIPl  =',14,', NRSKI ? 1  =’,I4,  ',X(1,D  =  '  , F9  . 
end  if 

c  Now  read  Infile2 

c  First  skip  nrskip2  records! 

do  i  =1,  nrskip2 
read ( 12 )  data ( i  ) 
end  do 

c  Read  next  nrout2  records! 

If  i  nrskipl  . eq .  0)  then 

nil  =  17 

else 

nil  =  1 

end  i  f 

ni2  =  nil  +  n rout 2  -1 


If  (  npskipl  .eq.  0)  then 

njl  =  17 

else 

n  j  1  =  1 

end  i  f 

nj2  =  njl  +  npout2  -1 
do  i  =  nil,  ni2 

read  (12)  (data(j),  j=l,npin2) 
k  =  1 

do  j  =  njl,  nj2 

N(i,j)  =  jzext(  data ( npskip2+k )  ) 

Y ( i , j )  =  df lot j (  N  (  i ,  j  )  ) 
k  =  k  +1 
end  do 
end  do 

Cross-check . 

If  ((npskipl  .eq.  224)  .and.  (nrskipl  .eq.  224))  then 
write  (6,891)  NPSKIP2 ,  NRSKIP2 ,  Y(nil,  njl) 
format  (lx,'NPSKIP2  = 1  , 14 , ' , NRSKIP2  = ’ ,  1 4 , ’  , Y ( 1 , 1 )  =',F 
end  i  f 

Compute  the  variances  for  the  5-point  search 

nn  =  8 
q  =  0 
1  =  0 
continue 
val i ( 1 )  =  0 
val j ( 1 )  =  0 
vali(2)  =  1  *  nn 
val j ( 2 )  =  0 
val i ( 3 )  =  0 
valj(3)  =  1  *  nn 
vali(4)  =  -1  *  nn 
val j ( 4 )  =  0 
val i ( 5 )  =  0 
val j ( 5 )  =  -1  *  nn 

cont i nue 
do  2000  p  *1,5 
sum  (p)  =  0.0 
k  =  0 
pk  =  0 

k  =  q  +  vali(p) 
pk  =  1  +  valj(p) 

Variance  =  infinity,  if  16  <  i,j  <  -16,  i.e., 
maximum  pixel  displacement  is  16  pixels. 

If  ((  k  .gt.  16)  .or.  (  k  .It.  -16)  .or. 

(pk  .gt.  16)  .or.  (pk  .It.  -16)  )  then 

sum(p)  =  1000000.0 

else 

do  200  ml  =  17,48 
do  100  nl  *  17,48 
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4000 


sum(p)  =  sum ( p )  +  (  X(ml,nl)  -  Y ( ml  +  k , n 1 +pk )  )  **  2 

cont i nue 

continue 

sum(p)  =  sum(p)  /  (32**2) 
end  if 
end  do 

Compute  the  minimum  variance 
cont i nue 
summ  =  0.0 

summ  =  SNGL(  DMIN1(  dble ( sum ( 1 ) ) ,  dble ( sum( 2 ) ) ,  dble ( sum ( 3 ) ) , 
+  dble ( sum( 4 ) ) ,  dble(sum(5))  )  ) 

Find  the  minimum. 

If  (summ  . eq.  sum(l))  then 
pq  =  l 


else  if 

( summ 

.  eq . 

sum( 2 ) ) 

then 

pq  =2 
else  if 

( summ 

.eq. 

sum( 3 )  ) 

then 

pq  =  3 
else  if 

( summ 

.  eq. 

sum( 4 ) ) 

then 

pq  =  4 
else  if 

( summ 

.  eq. 

sum( 5 )  ) 

then 

pq  =  5 
end  i  f 

q  =  q  + 

1  =  1  + 

val i (pq) 
val j (pq) 

If  ( pq  . 

eq.  1 ) 

then 

go  to  2010 

else  if  (pq  .gt.  1)  then 

go  to  2020 

end  i  f 

cont inue 

nn  =  nn/2 

If  (  nn  .eq.  1)  then 
go  to  4000 

else  if  (  nn  .gt.  1)  then 
go  to  3000 
end  i  f 
continue 
vali2(l)  =  0 
val j  2 ( 1 )  =  0 
val i 2 ( 2 )  =  1 
val j  2 ( 2 )  =  0 
val i 2 ( 3 )  =  1 
val j  2 ( 3 )  =  1 
val i 2 ( 4 )  =  0 
val j  2 ( 4 )  =  1 
val i 2 ( 5 )  =  - 1 
val j  2 ( 5 )  =  1 
val i 2 ( 6 )  =  - 1 
vaij2(6)  =  0 


va 1 i 2 ( 7 )  =  -1 
val j  2 ( 7 )  =  -1 
val  i.2  ( 8  )  =  0 
valj2(8)  =  -1 
val  i 2 ( 9 )  =  1 
valj2(9)  =  -1 

Compute  the  variances  for  the  final  9-point  search, 
i.e.,  over  -1  (=,<)  i,j  (=,<)  1. 
do  5000  p  =1,9 
sum2(p)  =  0.0 
k  =  0 
pk  =  0 

k=  q  +  vali2(p) 
pk  =  1  +  valj2(p) 

If  ((  k  ,gt.  16)  .or.  (  k  .It.  -16)  .or. 

(pk  .gt.  16)  .or.  (pk  .It.  -16)  )  then 

sum2(p)  =  1000000.0 

else 

do  500  ml  =  17,48 
do  400  nl  =  17,48 

sum2(p)  =  sum2(p)  +  (  X(ml,nl)  -  Y(ml+k , nl+pk )  )  **  2 

continue 

continue 

sum2(p)  =  sum2(p)  /  (32**2) 
end  i  f 
end  do 

Compute  the  minimum  variance 
summ2  =  0.0 

summ2  =  SNGL (  DMIN1(  dble ( sum2 ( 1 ) ) ,  dble ( sum2 ( 2 ) ) , 

dble ( sum2 ( 3 ) ) ,  dble ( sum2 ( 4 ) ) ,  dble( sum2 ( 5 ) ) ,  dble ( sum2 ( 6 ) ) 

dble ( sum2 ( 7 ) ) ,  dble ( sum2 ( 8 ) ) ,  dble ( sum2 ( 9 ) )  )) 

Find  the  minimum. 

If  (summ2  .eq.  sum2(D)  then 
pq=l 

else  if  (summ2  .eq.  sum2(2))  then 
pq  =2 

else  if  (summ2  .eq.  sum2(3))  then 
pq=3 

else  if  (summ2  .eq.  sum2(4))  then 
pq=4 

else  if  (summ2  .eq.  sum2(5))  then 
pq=5 

elseif  (summ2  .eq.  sum2(6))  then 
pq  =  6 

else  if  (summ2  .eq.  sum2(7))  then 

pq  =7 

else  if  ( summ2  .eq.  sum2(8))  then 
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pq=3 

else  if  (summ2  . eq .  sum2(S)j  then 

pq=9 

end  i  f 

q  =  q  +  vali2(pq) 

1  =  1  +  valj2(pq) 

Again,  limit  the  displacement  to  x,y  (<,  =  )  16. 


If  (  (q  .It.  (-1*16))  .or.  (q  .gt.  16) 
+  .or.  (1*  .It.  (-1*16))  )then 
sum(pq)  =  1000000.0 
q  =  q  -  val i ( pq ) 

1  =  1  -  val j (pq) 
go  to  3005 
end  i  f 


(1  . gt .  16 


tq  =  tq  +  q 
tl  =  tl  +  1 
tsum  =  tsum  +  summ2 


6001  close  (11) 
close  (12) 
1 0  END  DO 


write  (6,794)  npskipl,  nrsk ipl , nn , tq, t 1 , tsum 
format  (lx, ' N?SKIP1= ’,14,'  , NRSK I Pl= ’  , 1 4 ,  '  ,nn=',I4,/ 

’  TOTAL  (SO  FAR)  DMD  DIMENSIONS:  q  =’,I4,'  ,  1  =',I4,/ 

'  ,MIN  (SO  FAR)  D(q, 1 )  =',F10.3) 

END  DO 


write  (6,993)  infilel,  infile2 

format  (/,'THIS  PROGRAM  COMPUTES  THE  MINIMUM  VARIANCE 
BETWEEN' , A, 'AND’ ,  A,' FOR  THE  1ST  256x256  SUB-IMAGE.') 

write  (6,804)  tq,tl,tsum 

format  (/,'  TOTAL  DMD  DIMENSIONS:  q  =',I4,'  ,1  =',I4, 

'  , MIN  AVG  VAR  =' ,F10.3) 
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APPENDIX  B 


NUMERICAL  VALUES  ASSOCIATED  WITH  THE  IMAGE  PROCESSING  PROBLEM 


B.l.  Non-Zero  Elements  of  A-Matrix 


gj  A 

(1,1) 

= 

0.216 

A 

(  1,  2) 

= 

0.208 

A 

(  1,  3) 

= 

0.199 

K?  A 

(  1,  4) 

0.192 

A 

(  1,  5) 

= 

0.184 

A 

(  2,  1) 

= 

0.172 

H  A 

(  2,  2) 

— 

0.179 

A 

(  2,  3) 

= 

0.172 

A 

(  2,  4) 

= 

0.165 

s]  A 

(  2,  5) 

— 

0.159 

A 

(  2,  6) 

= 

0.152 

A 

(  3,  1) 

= 

0.142 

ft  A 

(  3,  2) 

= 

0.148 

A 

(  3,  3) 

= 

0.154 

A 

(  3,  4) 

= 

0 . 148 

|  A 

(3,5) 

zr 

0.142 

A 

(  3,  6) 

= 

0.136 

A 

(  3,  7) 

— 

0.131 

(  4,  1) 

— 

0.120 

A 

(  4,  2) 

= 

0.125 

A 

(  4,  3) 

= 

0.130 

jsj  A 

(  4,  4) 

— 

0.136 

A 

(  4,  5) 

= 

0.130 

A 

(  4,  6) 

0.125 

ft  A 

(  4,  7) 

— 

0.120 

A 

(  4,  8) 

= 

0.115 

A 

(  5,  1) 

= 

0 . 103 

A 

(  5,  2) 

- 

0.108 

A 

(  5,  3) 

= 

0.112 

A 

(  5,  4) 

= 

0.117 

LV  A 

(  5,  5) 

= 

0.122 

A 

(  5 ,  6  ) 

= 

0.117 

A 

(  5,  7) 

0.112 

K  A 

(  5,  8) 

- 

0.108 

A 

(  5,  9) 

= 

0.103 

A 

(  6,  2) 

= 

0 . 103 

w  A 

(  6,  3) 

= 

0.108 

A 

(  6,  4) 

= 

0.112 

A 

(  6,  5) 

= 

0.117 

&  A 

(  6 ,  6 ) 

j- 

0.122 

A 

(  6,  7) 

= 

0.117 

A 

(  6,  8) 

= 

0.112 

•  A 

(  6,  9) 

— 

0.108 

A 

(  6,10) 

= 

0.103 

A 

(  7,  3) 

s 

0.103 

E  A 

(  7,  4) 

— 

0.108 

A 

(  7,  5) 

= 

0.112 

A 

(  7,  6) 

= 

0.117 

\>3  . 

iw  A 

(  7,  7) 

— 

0.122 

A 

(  7,  8) 

= 

0.117 

A 

(  7,  9) 

= 

0.112 

>\  A 

(  7,10) 

=5 

0.108 

A 

(  7,11) 

= 

0.103 

A 

(  8,  4) 

= 

0.103 

fe  A 

(  8,  5) 

= 

0.108 

A 

(  8,  6) 

= 

0.112 

A 

(  8,  7) 

= 

0.117 

(  8,  8) 

= 

0.122 

A 

(  8,  9) 

= 

0.117 

A 

(  8,10) 

= 

0.112 

n  A 

(  8,11) 

- 

0.108 

A 

(  8,12) 

= 

0.103 

A 

(  9,  5) 

0.103 

Kj  A 

(  9,  6) 

52 

0.108 

A 

(  9,  7) 

= 

0.112 

A 

(  9,  8) 

= 

0.117 

(  9,  9) 

— 

0.122 

A 

(  9,10) 

= 

0.117 

A 

(  9,11) 

= 

0.112 

A 

(  9,12) 

— 

0.108 

A 

(  9,13) 

= 

0.103 

A 

(10,  6) 

— 

0.103 

ft  A 

(10,  7) 

= 

0.108 

A 

(10,  8) 

= 

0.112 

A 

(10,  9) 

= 

0.117 

M  A 

(10,10) 

0.122 

A 

(10,11) 

= 

0.117 

A 

(10,12) 

= 

0 .112 

5  A 

(10,13) 

= 

0.108 

A 

(10,14) 

= 

0.103 

A 

(11,  7) 

= 

0 .103 

£  a 

(11,  8) 

= 

0.108 

A 

(11,  9) 

= 

0.112 

A 

(11,10) 

= 

0.117 

W  A 

(11,11) 

s 

0.122 

A 

(11,12) 

= 

0.117 

A 

(11,13) 

= 

0.112 

r$  A 

( 11,14) 

- 

0.108 

A 

( 11,15) 

= 

0.103 

A 

(12,  8) 

= 

0 . 103 

(12,  9) 

- 

0.108 

A 

(12,10) 

= 

0.112 

A 

(12,11) 

= 

0.117 

^  A 

(12,12) 

25 

0.122 

A 

(12,13) 

= 

0.117 

A 

(12,14) 

0.112 

TO  A 

(12,15) 

= 

0.108 

A 

(12,16) 

= 

0.103 

A 

(13,  9) 

= 

0.103 

S  A 

(13,10) 

— 

0.108 

A 

(13,11) 

= 

0.112 

A 

(13,12) 

= 

0 .117 

ft  A 

(13,13) 

rz 

0.122 

A 

(13,14) 

= 

0.117 

A 

(13,15) 

= 

0.112 

•t-  j 

M  A 

(13,16) 

— 

0.108 

A 

(13,17) 

= 

0.103 

A 

(14,10) 

= 

0 . 103 

/yl  V 

i-i  A 

(14,11) 

- 

0.108 

A 

(14,12) 

= 

0.112 

A 

(14,13) 

= 

0.117 

A  . 

.A  A 

(14,14) 

25 

0.122 

A 

(14,15) 

= 

0.117 

A 

(14,16) 

~ 

0.112 

w 

(14,17) 

- 

0.108 

A 

(14,18) 

= 

0.103 

A 

(15,11) 

= 

0.103 

A 

(15,12) 

- 

0.108 

A 

(15,13) 

= 

0.112 

A 

( 15,14) 

s: 

0.117 

b/  A 

(15,15) 

— 

0.122 

A 

(15,16) 

= 

0.117 

A 

(15,17) 

= 

0.112 

A 

(15,18) 

=: 

0.108 

A 

(15,19) 

= 

0.103 

A 

(16,12) 

— 

0.103 

t>‘  A 

(16,13) 

25 

0.108 

A 

(16,14) 

= 

0.112 

A 

(16,15) 

= 

0.117 

|M  A 

(16,16) 

- 

0.122 

A 

(16,17) 

= 

0.117 

A 

(16,18) 

=: 

0.112 

W  A 

(16,19) 

— 

0. 108 

A 

(16,20) 

= 

0.103 

A 

(17,13) 

= 

0.  103 

tV  A 

(  17  ,  14  ) 

2= 

0.108 

A 

(17,15) 

= 

0.112 

A 

(17,16) 

0.117 

tv  A 

(17,17) 

25 

0.122 

A 

(17,18) 

= 

0.117 

A 

(  17,19) 

0.112 

($  A 

(17,20) 

= 

0.108 

A 

(17,21) 

= 

0.103 

A 

(18,14) 

0.103 
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